Explanations
possessive pronouns and possessive forms of nouns
Negative Logits
Token     Logit
stal      -0.17
uri       -0.15
елиÑĩ     -0.15
ongs      -0.14
anzi      -0.14
abar      -0.14
è¹        -0.14
Burk      -0.13
sovere    -0.13
tones     -0.13
Positive Logits
Token     Logit
itage     0.14
ito       0.14
parents   0.13
äºĪ       0.13
itos      0.13
_Parse    0.13
Sweep     0.13
ê²°       0.13
arse      0.13
ennon     0.13
Activations Density: 0.291%