INDEX
Explanations
references to political figures and their actions or statements
Negative Logits
owell      -0.16
IGIN       -0.16
zza        -0.15
illo       -0.15
asje       -0.15
_TestCase  -0.14
oomla      -0.14
íky        -0.14
ủng        -0.14
toler      -0.14
Positive Logits
Biden       0.27
Delaware    0.26
Beau        0.23
Joe         0.22
Wilmington  0.21
Vice        0.20
VP          0.20
vice        0.20
Hunter      0.20
Scr         0.19
Activations Density 0.019%