INDEX
    Explanations
    NEGATIVE LOGITS
    Token         Logit
     Gun          -0.07
     наруж        -0.06
    '=            -0.06
    iddled        -0.06
    ΟΝ            -0.06
     box          -0.06
    antas         -0.06
    _slave        -0.06
     hungry       -0.06
    ’B            -0.06

    POSITIVE LOGITS
    Token         Logit
    لة            0.07
    jíž           0.07
     teachings    0.06
    latex         0.06
    likleri       0.06
     rostlin      0.06
    -cigaret      0.06
    imageView     0.06
    .Classes      0.06
     Edited       0.06
    Activation Density 0.000%

    No Known Activations