INDEX
Explanations
Frequent references to the second-person pronouns "you" and "your."
Negative Logits
Ades     -0.71
abel     -0.67
PED      -0.63
idem     -0.62
ASA      -0.59
verdi    -0.59
#:       -0.59
Oda      -0.58
Coch     -0.58
Kras     -0.57
Positive Logits
you      1.69
You      1.64
you      1.60
You      1.58
YOU      1.52
YOU      1.51
we       1.04
Vous     1.01
We       0.98
您       0.97
Activations Density 0.239%