    Negative Logits
    " anni"          -0.07
    " anticipate"    -0.07
    "ofil"           -0.07
    " oral"          -0.07
    "ॉल"             -0.07
    " Tek"           -0.07
    "ILL"            -0.06
    "onis"           -0.06
    "кового"         -0.06
    "ERAL"           -0.06
    Positive Logits
    " unit"          0.10
    " units"         0.07
    " Unit"          0.07
    "μενη"           0.06
    "cargo"          0.06
    " UNIT"          0.06
    " radiation"     0.06
    " зави"          0.06
    " Padding"       0.06
    " SUCCESS"       0.06
    Activation Density: 0.015%

    No Known Activations