INDEX
    Negative Logits
    Token          Logit
    "oma"          -0.07
    "ID"           -0.07
    " продолж"     -0.07
    " Kang"        -0.06
    (not shown)    -0.06
    "iptables"     -0.06
    "[input"       -0.06
    " Fighting"    -0.06
    ".release"     -0.06
    " hallway"     -0.06
    Positive Logits
    Token          Logit
    " sensitive"   0.08
    (not shown)    0.07
    "سي"           0.07
    " возника"     0.07
    "entlich"      0.07
    " жен"         0.06
    " ekip"        0.06
    "stress"       0.06
    " due"         0.06
    " cuisine"     0.06
    Activation Density: 0.012%

    No Known Activations