INDEX
Explanations
phrases related to political and social commentary
elements related to social issues and injustices
Negative Logits
yssey        -0.85
initially    -0.84
Initially    -0.81
Factors      -0.78
Initially    -0.75
pects        -0.74
ynchron      -0.73
earchers     -0.73
Adapt        -0.71
Originally   -0.71
Positive Logits
whining       1.05
goddamn       1.05
bulldo        1.04
thugs         1.04
idiots        1.02
defund        1.00
bureaucrats   0.99
fucking       0.99
lobbyists     0.98
torches       0.97
Activations Density 0.844%