Explanations
politically charged keywords and names, with a focus on specific individuals and terms related to governmental and political activities
Negative Logits
hyde          -0.77
Annotations   -0.70
dwarves       -0.69
Fargo         -0.68
substitutes   -0.67
learners      -0.65
Rhod          -0.64
constructs    -0.64
portals       -0.62
extracts      -0.62
Positive Logits
agging        1.07
acious        1.06
ithering      1.04
assion        1.03
anky          1.02
ounding       1.02
uper          1.00
angu          0.98
attering      0.98
isive         0.98
Activations Density: 0.220%