INDEX
Explanations
conditional statements and indicators of necessity or obligation
Negative Logits
olut     -0.16
ay       -0.15
éal      -0.15
wick     -0.14
aning    -0.14
éc       -0.14
LETTE    -0.14
--       -0.14
óm       -0.14
mine     -0.14
Positive Logits
ullan    0.15
atos     0.15
aldo     0.14
ovice    0.14
Caller   0.14
IVEN     0.14
riteln   0.13
Fur      0.13
èĩº      0.13
egin     0.13
Activations Density: 0.006%