INDEX
Explanations
occurrences of the pronoun "I"
"i" and "I" variants, especially at the start of a phrase
Negative Logits
Token         Logit
enfans        -0.71
uſed          -0.68
ſta           -0.67
raiſ          -0.65
themſelves    -0.65
unſ           -0.62
harusnya      -0.61
ſte           -0.60
juſ           -0.60
ſever         -0.59
Positive Logits
Token         Logit
i             0.91
in            0.68
I             0.66
I             0.66
i             0.65
In            0.62
в             0.57
في            0.53
ใน            0.49
i             0.47
Activations Density 0.000%