INDEX
Explanations
expressions indicating action or change in state
Negative Logits
Token     Logit
292       -0.16
757       -0.15
ceiver    -0.15
phans     -0.14
uy        -0.14
skyt      -0.13
lech      -0.13
nuts      -0.13
ernote    -0.13
entials   -0.13
Positive Logits
Token     Logit
ednou     0.20
oad       0.16
Benson    0.15
840       0.14
HAS       0.14
cka       0.14
          0.14
Pear      0.13
Factory   0.13
Farr      0.13
Activations Density 0.406%