Explanations
discussions of personal experiences, often with emotional or controversial implications
Negative Logits
deres       -0.90
ourselves   -0.80
彼らは      -0.79
yourselves  -0.74
deras       -0.71
eorum       -0.67
loro        -0.65
kanilang    -0.65
unison      -0.60
seamnă      -0.60
Positive Logits
himself     1.86
his         1.82
himself     1.49
his         1.31
seinem      1.16
seiner      1.12
kanyang     1.04
seines      1.03
dirinya     1.00
Himself     0.98
Activations Density: 1.236%