INDEX
    Explanations

    references to assistance or help-seeking behaviors

    Negative Logits
    " shrinks"     -0.79
    "плек"         -0.75
    "lujah"        -0.74
    " scared"      -0.73
    "Everybody"    -0.71
    " thinks"      -0.71
    " girls"       -0.70
    "lasyon"       -0.70
    " boss"        -0.69
    " people"      -0.68
    Positive Logits
    " utilizing"   1.01
    " Дан"         1.00
    " utilising"   0.98
    " אשר"         0.97
    "Дан"          0.96
    " poichè"      0.95
    " می‌باشد"      0.94
    " данного"     0.91
    " alábbi"      0.90
    " tevens"      0.89
    Activation Density: 1.220%

    No Known Activations