INDEX
Explanations
references to political figures and their actions
Negative Logits
Token      Logit
elo        -0.16
ellers     -0.15
plet       -0.15
%p         -0.15
fois       -0.15
Looper     -0.14
orent      -0.14
å¡ļ        -0.14
дина       -0.14
rowsers    -0.14
Positive Logits
Token         Logit
Delaware      0.21
Beau          0.21
Biden         0.18
Wilmington    0.17
US            0.15
White         0.15
Malone        0.14
malar         0.14
202           0.14
mute          0.14
Activations Density 0.032%