    NEGATIVE LOGITS
     is           -1.24
     I            -1.20
     (            -1.13
     these        -1.11
    ;");          -1.06
    ,             -1.02
     we           -1.01
     if           -1.00
     your         -0.99
    +*            -0.98
    POSITIVE LOGITS
    いいですね         1.34
    harmed        1.33
    ollary        1.26
     desactivar   1.24
    turnstile     1.20
    butuhkan      1.18
     сообщил      1.17
     telefonu     1.17
     chiare       1.16
     eccell       1.15
    Activation Density: 0.147%

    No Known Activations