Explanations
phrases addressing or referring to the audience directly
references to the audience or supporters
Negative Logits
ipal       -0.82
aughed     -0.76
ĸļ         -0.74
entimes    -0.71
pite       -0.64
ice        -0.64
ariat      -0.61
imately    -0.59
apo        -0.59
uel        -0.58
Positive Logits
guys          1.53
tub           1.30
RS            1.25
're           1.09
Tube          0.92
NG            0.88
sir           0.83
yourselves    0.81
'll           0.81
personally    0.80
Activations Density 0.122%