INDEX
    Negative Logits
     koş    -0.07
     паль    -0.07
    :void    -0.06
     Engineer    -0.06
     eliminated    -0.06
    aporan    -0.06
     HUD    -0.06
     іншого    -0.06
     RIP    -0.06
     Hin    -0.06
    Positive Logits
     ant    0.07
    /format    0.07
    _lens    0.07
    ("----------------    0.07
    671    0.06
    ("--------------------------------    0.06
    font    0.06
     Supern    0.06
     implement    0.06
     Sem    0.06
    Activation Density: 0.155%

    No Known Activations