INDEX
    Explanations

    phrases expressing criticism or negative judgment

    Negative Logits
     Voi
    -0.57
     Võ
    -0.55
     fince
    -0.54
     {$\
    -0.53
    AppCompatTheme
    -0.50
     Hæ
    -0.49
    ecera
    -0.49
     whofe
    -0.49
     traverser
    -0.49
     zoll
    -0.48
    Positive Logits
     necessarily
    0.84
    necessarily
    0.61
     not
    0.59
     NOT
    0.50
     ļ
    0.50
     consultato
    0.49
    not
    0.48
     merely
    0.48
     unlike
    0.48
     necesariamente
    0.48
    Activation Density 0.099%

    No Known Activations