INDEX
Explanations
anti-<specific group or issue> sentiments
Negative Logits
Token         Logit
ĸļ            -0.74
Sins          -0.68
Pelicans      -0.64
resid         -0.64
ãĤ¼ãĤ¦ãĤ¹     -0.63
Gorge         -0.63
Legends       -0.63
Halls         -0.62
Tycoon        -0.61
Taste         -0.61
Positive Logits
Token         Logit
abortion      1.21
establishment 1.16
war           1.10
hero          0.97
strate        0.93
overty        0.90
immigrant     0.89
Semitic       0.88
jud           0.87
inflammatory  0.87
Activations Density 0.316%