INDEX
    Explanations
    Negative Logits
    " om"               -0.07
    "_extended"         -0.06
    " neby"             -0.06
    "kové"              -0.06
    " seront"           -0.06
    "SCO"               -0.06
    " authoritarian"    -0.06
    "μέ"                -0.06
    " azi"              -0.06
    " oltre"            -0.06
    Positive Logits
    "amples"            0.07
    "λι"                0.06
    "\Context"          0.06
    " asleep"           0.06
    " Sly"              0.06
    ".compat"           0.06
    "edis"              0.06
    "enis"              0.06
    "quist"             0.06
    " оборуд"           0.06
    Activation Density: 0.001%

    No Known Activations