Explanations
references to significant actions and positions related to social issues and policies
Negative Logits
challeng    -0.61
destro      -0.58
Vaugh       -0.57
assum       -0.55
withd       -0.55
thous       -0.53
cryst       -0.52
[*          -0.51
tradem      -0.51
proble      -0.51
Positive Logits
cigarettes  0.54
ratom       0.48
Belfast     0.47
cigarette   0.46
rock        0.46
hra         0.45
aus         0.45
wolves      0.44
hillary     0.43
Corbyn      0.43
Activations Density: 1.155%