Explanations
references to political figures
conversational elements and expressions of consideration or preference
Negative Logits
.)            -0.69
AFP           -0.66
damning       -0.64
Shutterstock  -0.64
awaits        -0.63
MSM           -0.61
Poverty       -0.61
powerless     -0.61
ACLU          -0.60
centuries     -0.60
Positive Logits
laughs        1.08
Laughs        1.00
initely       0.94
['            0.92
entimes       0.88
yss           0.87
[             0.83
laughter      0.81
ovie          0.79
icularly      0.78
Activations Density 0.659%