INDEX
    Explanations

    phrases containing directives or requests

    Negative Logits
     Gött            -0.73
     ++)             -0.62
     моск            -0.61
     gând            -0.60
     Москвы          -0.59
     Ruto            -0.59
    or               -0.57
    ly               -0.57
    ity              -0.57
     què             -0.57
    Positive Logits
    К                0.80
     на              0.79
    lapsible         0.74
    MLLoader         0.73
     ال              0.73
     в               0.72
     за              0.72
     FormsModule     0.72
    В                0.71
     ל               0.70
    Act Density 0.009%

    No Known Activations