INDEX
Explanations
instances of the verb "be" in various forms
Negative Logits
kel      -0.15
uess     -0.15
Bian     -0.14
式       -0.14
eka      -0.14
iego     -0.14
такое    -0.14
addon    -0.14
bars     -0.14
esse     -0.13
Positive Logits
gay      0.17
ToF      0.16
spl      0.15
YRO      0.15
рост     0.15
isay     0.15
dra      0.14
sip      0.14
sted     0.14
bum      0.14
Activations Density 0.030%