INDEX
Explanations
statements made by individuals in news or interview settings
statements related to political issues and controversies
Negative Logits
Token         Logit
aspirin       -0.68
storefront    -0.65
totality      -0.61
cox           -0.60
lantern       -0.60
basil         -0.59
bandits       -0.59
crimson       -0.58
moth          -0.58
goblins       -0.58
Positive Logits
Token         Logit
Speaking      1.58
Speaking      1.35
Asked         1.24
Interested    1.21
Writing       1.15
Refer         1.07
Talking       1.06
Discuss       1.00
Calling       0.98
Asked         0.98
Activations Density: 0.395%