INDEX
Explanations
phrases related to political discourse
Negative Logits
Token      Logit
GOODMAN    -0.73
adm        -0.70
diplom     -0.70
Aval       -0.67
Rebels     -0.65
Lauder     -0.64
ģĸ         -0.63
compe      -0.61
Exodus     -0.61
susp       -0.60
Positive Logits
Token      Logit
ossible    1.31
redict     1.31
aired      1.29
ivot       1.25
ierce      1.24
ardon      1.23
ixels      1.20
oultry     1.20
ulse       1.19
uppet      1.16
Activations Density: 0.345%