Explanations
references to political figures and their statements concerning discrimination and freedom of speech
Head Attr Weights
0:0.07
1:0.01
2:0.06
3:0.08
4:0.04
5:0.09
6:0.07
7:0.06
8:0.37
9:0.04
10:0.03
11:0.03
Negative Logits
Quincy      -3.31
Lowell      -3.15
Syracuse    -3.10
acci        -3.00
arnaev      -2.98
Sweeney     -2.97
Erie        -2.93
COMPLE      -2.88
Corvette    -2.86
Rhode       -2.85
Positive Logits
Dutch          4.85
Netherlands    4.60
Dutch          4.42
ko             3.91
Ajax           3.85
PV             3.70
Johannes       3.67
Farage         3.60
Danish         3.60
Denmark        3.58
Activations Density 0.003%