INDEX
    Explanations
    Negative Logits
     antib             -0.08
     impos             -0.08
     reforms           -0.08
     impose            -0.08
     abat              -0.08
    (token not shown)  -0.07
    (token not shown)  -0.07
     battling          -0.07
     berubah           -0.07
    (token not shown)  -0.07
    Positive Logits
    favicon            0.10
     оформление        0.09
     Nemo              0.08
    paper              0.08
    -paper             0.08
     cheesecake        0.08
    .ico               0.08
     useless           0.08
     spe               0.07
    ින                 0.07
    Activation Density: 0.001%

    No Known Activations