INDEX
Explanations
phrases related to social and political topics
terminology related to social welfare and political activities
Negative Logits
xual        -0.76
mble        -0.66
pta         -0.63
proble      -0.60
together    -0.59
curv        -0.58
rul         -0.54
merce       -0.54
tiss        -0.54
inho        -0.53
Positive Logits
sylv        0.59
PAC         0.58
Cruz        0.57
Washington  0.56
Tillerson   0.55
ALEC        0.55
Conway      0.54
ashington   0.54
Ankara      0.53
cipled      0.53
Activations Density: 1.282%