INDEX
Explanations
mentions of governance and political events
Negative Logits
Dinner    -0.16
atern     -0.15
ź         -0.15
izard     -0.15
urum      -0.15
aus       -0.14
anghai    -0.14
hy        -0.14
letters   -0.13
gmail     -0.13
Positive Logits
OLLOW     0.17
swick     0.16
ombat     0.16
elu       0.16
anine     0.16
enburg    0.15
ibilit    0.15
dü        0.14
asan      0.14
onet      0.14
Activations Density: 0.122%