Explanations
Pronouns referring to individuals, particularly possessive pronouns
Negative Logits
odash          -0.17
ư              -0.17
igans          -0.15
deen           -0.15
tread          -0.15
ایر            -0.15
uld            -0.14
Independent    -0.14
आध             -0.14
еле            -0.13
Positive Logits
inton          0.16
arrass         0.15
iting          0.14
Copyright      0.14
igits          0.14
ành            0.14
627            0.14
imators        0.14
عرب            0.13
ADVISED        0.13
Activations Density: 0.116%