Explanations
the pronoun "they" and its variations
Negative Logits
onor     -0.58
Conci    -0.57
Salat    -0.56
Grün     -0.54
Parke    -0.54
Hins     -0.54
fuer     -0.54
ín       -0.53
Coeur    -0.53
mín      -0.53
Positive Logits
they     1.79
THEY     1.71
They     1.68
they     1.65
They     1.61
THEY     1.57
he       1.37
she      1.20
mereka   1.13
He       1.13
Activations Density: 0.134%