INDEX
    Explanations
    NEGATIVE LOGITS
    (connection    -0.08
    likelihood     -0.07
    Officials      -0.07
     жест          -0.07
    mia            -0.07
    _PEER          -0.06
    separator      -0.06
     Terrorism     -0.06
     дозвол        -0.06
     reconcile     -0.06
    POSITIVE LOGITS
    :bold          0.06
     Industrial    0.06
     retard        0.06
    σου            0.06
    ेटर            0.06
    /#             0.06
     Rescue        0.06
    ution          0.06
     широк         0.06
    ışı            0.06
    Activation Density: 0.000%

    No Known Activations