INDEX
Explanations
names of political figures or people involved in scandals
mentions of specific individuals, particularly Abbas and Abram
Negative Logits
nces        -0.91
omore       -0.86
ndra        -0.68
Doom        -0.67
IGH         -0.66
Savior      -0.65
bones       -0.65
OMG         -0.64
Flavoring   -0.62
gow         -0.62
Positive Logits
raham       1.01
ilities     0.85
ortion      0.83
ãĥĩ         0.80
aic         0.79
osta        0.78
bey         0.77
pour        0.77
rod         0.77
bed         0.76
Activations Density: 0.015%