INDEX
Explanations
proper nouns and specific phrases related to political figures and events
Negative Logits
warts     -0.72
ancock    -0.69
iatus     -0.68
ukemia    -0.67
ufact     -0.66
odiac     -0.66
inois     -0.64
zon       -0.63
leneck    -0.62
tackle    -0.61
Positive Logits
same           1.02
hashtag        1.00
restroom       0.95
phrase         0.90
pseudonym      0.89
opportunity    0.89
excuse         0.88
analogy        0.86
tactic         0.84
technique      0.84
Activations Density: 0.115%