INDEX
Explanations
expressions related to political controversies
Negative Logits
©¶æ¥µ     -0.84
Cot       -0.77
Wol       -0.77
Dres      -0.75
KS        -0.74
Phelps    -0.74
Elves     -0.73
Mend      -0.72
Lauder    -0.72
Frey      -0.70
Positive Logits
tenance       1.22
withstanding  1.15
actly         1.13
usterity      1.07
actory        1.07
secut         1.06
ificantly     1.06
acted         1.04
pect          1.02
astic         1.01
Activations Density: 1.037%