INDEX
Explanations
Content related to news articles, including mentions of specific individuals, political statements, or details about various incidents.
Negative Logits
riot      -0.34
riots     -0.33
rend      -0.32
agonal    -0.32
downed    -0.31
traitor   -0.31
lag       -0.31
crunch    -0.30
enemy     -0.30
kHz       -0.29
Positive Logits
$$$$      0.36
Flowers   0.34
SCP       0.33
Username  0.33
Airl      0.33
NR        0.33
Etsy      0.33
imming    0.32
Fa        0.32
livious   0.31
Activations Density 0.147%