INDEX
Explanations
references to political ideals and democracy
Negative Logits
themſelves    -0.95
himſelf       -0.85
poffe         -0.84
itſelf        -0.81
Jefus         -0.81
myſelf        -0.80
edelstahl     -0.80
pleaſure      -0.78
Monfieur      -0.77
ſmall         -0.76
Positive Logits
political     0.84
global        0.81
political     0.67
Political     0.63
global        0.60
Global        0.59
poli          0.58
POLITICAL     0.58
Political     0.57
mergeFrom     0.57
Activations Density: 0.504%