INDEX
Explanations
phrases related to political figures and events
Negative Logits
buster         -0.71
ante           -0.66
Accessed       -0.65
nam            -0.62
venant         -0.62
Donation       -0.62
cheon          -0.61
breaks         -0.59
querque        -0.59
articulated    -0.58
Positive Logits
us             1.11
passers        1.03
superiors      1.00
him            0.98
listeners      0.98
outsiders      0.95
viewers        0.95
me             0.95
everyone       0.92
readers        0.91
Activations Density: 1.342%