Explanations
possessive pronouns indicating ownership or association
Negative Logits
Token          Logit
inne           -0.15
itoris         -0.15
pery           -0.14
-footer        -0.14
-urlencoded    -0.14
ÑĤÑĢÑĥ         -0.14
kir            -0.14
jt             -0.13
(*((           -0.13
deaux          -0.13
Positive Logits
Token          Logit
own            0.29
Own            0.21
Own            0.21
OWN            0.20
ehler          0.17
_own           0.16
own            0.16
próp           0.16
ataka          0.16
SEL            0.15
Activations Density 0.231%