INDEX
Explanations
various forms of the word "to be" and associated linguistic nuances
Negative Logits
imir      -0.17
andle     -0.15
andles    -0.15
annel     -0.15
ANDLE     -0.15
apor      -0.14
abyrin    -0.14
iod       -0.14
å®        -0.14
asury     -0.14
Positive Logits
olon      0.22
Rolling   0.19
rolling   0.18
Ke        0.18
airo      0.17
Ke        0.17
llib      0.15
OLON      0.15
rolling   0.15
KE        0.15
Activations Density 0.027%