INDEX
Explanations
pronouns and their associated subjects in sentences
il followed by verb or pronoun
Negative Logits
Token      Logit
plancher   -0.42
hvid       -0.41
vœ         -0.41
chêne      -0.40
Ouest      -0.40
ंदु         -0.40
fæ         -0.37
Ouest      -0.37
miroir     -0.36
sourire    -0.36
Positive Logits
Token      Logit
It         0.82
it         0.80
they       0.79
It         0.76
they       0.75
there      0.75
They       0.75
THEY       0.71
They       0.68
There      0.67
Activations Density: 0.005%