Explanations
key political figures and related discussions
Negative Logits
boss       -0.14
oki        -0.13
iben       -0.13
dispatch   -0.13
ibe        -0.13
builtin    -0.13
akk        -0.12
alyze      -0.12
ilet       -0.12
intree     -0.12
Positive Logits
jem            0.15
UDO            0.14
unnecessarily  0.14
LineColor      0.14
retweeted      0.13
Flake          0.13
lash           0.13
urator         0.13
etxt           0.13
guar           0.13
Activations Density 0.821%