Explanations
negative characterizations or descriptors related to current events, particularly in political contexts
Negative Logits
Effective       -0.78
probable        -0.73
enriched        -0.71
theless         -0.71
Enhanced        -0.71
imately         -0.70
certific        -0.69
authorised      -0.68
individually    -0.68
contracted      -0.68
Positive Logits
stros           1.03
aunts           0.89
agonist         0.88
anny            0.85
isms            0.84
unker           0.84
asses           0.83
angles          0.83
itty            0.82
otypes          0.81
Activations Density 0.179%