INDEX
Explanations
references to ownership or possession
Negative Logits
yourselves    -0.68
conmigo       -0.67
comigo        -0.61
المعيارى      -0.58
Himself       -0.57
själva        -0.56
himself       -0.54
correctly     -0.54
irse          -0.54
herself       -0.53
Positive Logits
sterious      0.84
riad          0.83
apologies     0.79
anmar         0.78
thic          0.73
rrh           0.72
opic          0.71
own           0.71
favorite      0.69
konos         0.69
Activations Density: 0.142%