INDEX
    Explanations

    phrases indicating negation or the absence of something

    Negative Logits
    aud               -0.16
    _DEFINED          -0.15
    iÃŁ               -0.15
    islav             -0.15
    inski             -0.14
    etta              -0.14
    antz              -0.14
    Äįer              -0.14
     FOREIGN          -0.14
    eut               -0.14
    Positive Logits
    ÑĢож              0.17
    .ActionListener   0.17
     back             0.17
    orda              0.15
     yet              0.15
     jus              0.15
     Disc             0.14
     jadx             0.14
    ãĤ¿ãĥ«            0.14
     fen              0.14
    Act Density 0.118%

    No Known Activations