INDEX
    Explanations

    technical terms and code-related language

    Negative Logits
     kasarigan  -0.82
     كومونز  -0.82
     étoient  -0.80
     indígen  -0.78
     avoient  -0.75
     wikipagina  -0.75
     ſtate  -0.75
     pleaſure  -0.74
     auroit  -0.73
     ब्रेकडाउन  -0.72
    Positive Logits
    1  0.64
    2  0.62
     ,  0.61
    7  0.59
    4  0.59
    ↵↵  0.58
    ↵↵↵  0.57
    8  0.57
    3  0.57
    5  0.57
    Activation Density: 7.620%

    No Known Activations