Explanations
personal pronouns and references to individuals
subject pronouns
Negative Logits
Token            Logit
azas             -0.54
RegressionTest   -0.51
uable            -0.49
Datuak           -0.48
lapping          -0.48
انجليز           -0.47
Stampa           -0.46
tainment         -0.46
TType            -0.46
Potential        -0.46
Positive Logits
Token            Logit
он               0.84
Он               0.79
она              0.71
Она              0.68
Она              0.68
Он               0.68
оно              0.66
Оно              0.63
мы               0.63
він              0.61
Activations Density: 0.002%