Explanations
references to self and others in various contexts
Negative Logits
Token        Logit
they         -0.62
we           -0.58
Cer          -0.54
he           -0.53
<h2>         -0.52
published    -0.49
']").        -0.49
viana        -0.48
alyptus      -0.48
untersch     -0.48
Positive Logits
Token        Logit
OGND         0.90
Him          0.87
Them         0.86
Them         0.85
Myself       0.81
Myself       0.79
Autoritní    0.79
myself       0.78
themselves   0.78
Him          0.77
Activations Density 0.137%