INDEX
Explanations
common English function words and phrases conveying basic actions or states
Negative Logits
æı        -0.15
odem      -0.14
umpy      -0.14
LAY       -0.14
Ä         -0.14
наÑĢ      -0.13
\Lib      -0.13
Desk      -0.13
\common   -0.13
cip       -0.13
Positive Logits
ignon     0.19
inton     0.18
uma       0.17
voie      0.15
iers      0.14
uku       0.14
ency      0.14
¸         0.14
anism     0.14
irth      0.14
Activations Density 0.002%