INDEX
Explanations
modal verbs, particularly "can," indicating ability or possibility
Negative Logits
aversable   -0.15
orsi        -0.15
acus        -0.15
optera      -0.15
793         -0.15
üst         -0.14
xfff        -0.14
ardown      -0.14
åģ          -0.14
erm         -0.14
Positive Logits
arded       0.17
sole        0.15
Howe        0.15
2           0.15
drives      0.15
abra        0.15
otope       0.14
Clo         0.14
not         0.14
missing     0.14
Activations Density: 0.112%