INDEX
Explanations
references to political scandals and resignations
Negative Logits
Token         Logit
antine        -0.16
jylland       -0.16
stants        -0.16
artner        -0.15
ancellable    -0.15
Ïĥη           -0.15
_interfaces   -0.15
é¼            -0.15
dad           -0.14
.cancel       -0.14
Positive Logits
Token         Logit
resignation   0.18
resign        0.18
ains          0.16
EC            0.16
Eg            0.14
oj            0.14
hum           0.14
Ngh           0.14
idge          0.14
562           0.14
Activations Density: 0.041%