Explanations
words indicating certainty or immediacy of events
Negative Logits
ooter    -0.17
Scho     -0.15
heimer   -0.15
ascus    -0.15
료       -0.14
考       -0.14
nika     -0.14
schop    -0.14
ULSE     -0.14
兴       -0.14
Positive Logits
sàng     0.16
Ej       0.14
be       0.14
획       0.13
dispens  0.13
ken      0.13
669      0.13
ouro     0.13
cruise   0.13
onec     0.13
Activations Density 0.126%
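
Raw vocabulary exports often show non-ASCII tokens in the logit lists as strings such as `åħ´` or `sÃłng`. The sketch below shows one way to turn such strings back into readable text. It assumes the underlying tokenizer uses GPT-2-style byte-level BPE, which this page does not state. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build the byte -> printable-character table used by GPT-2-style
    byte-level BPE (assumed here; the source page does not name a tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable stand-in character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(raw: str) -> str:
    """Map a raw vocabulary string back to bytes, then decode as UTF-8.
    Incomplete byte sequences are shown with the replacement character."""
    data = bytes(UNICODE_TO_BYTE[ch] for ch in raw)
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for raw in ["åħ´", "èĢĥ", "ë£Į", "íļį", "sÃłng", "cruise"]:
        print(repr(raw), "->", repr(decode_token(raw)))
```

Plain ASCII tokens such as `cruise` pass through unchanged, so the same function can be applied to every entry in both lists.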