    NEGATIVE LOGITS
     ringing        -0.07
    _init           -0.06
    167             -0.06
    .tpl            -0.06
     зан            -0.06
     altercation    -0.06
     желуд          -0.06
    pack            -0.06
     puzzle         -0.06
     seek           -0.06
    POSITIVE LOGITS
     allowing       0.09
     allowed        0.08
     mạng           0.07
                    0.07
     allow          0.06
     permit         0.06
     itibar         0.06
    держ            0.06
     longest        0.06
     xc             0.06
    Act Density 0.028%

    No Known Activations