INDEX
Explanations
expressions related to receiving or demanding attention or action from others
references to the collective pronoun "us."
Negative Logits
fect    -0.62
Reilly  -0.62
aryl    -0.59
Levine  -0.58
stick   -0.57
Offic   -0.56
tein    -0.55
ME      -0.55
inois   -0.54
endra   -0.54
Positive Logits
selves     1.21
ourselves  1.11
hers       1.06
selves     0.94
aning      0.86
ury        0.85
urious     0.84
urers      0.82
ern        0.82
mortals    0.80
Activations Density 0.085%