Explanations
phrases related to political discourse and argumentation
phrases related to critique or criticism of societal issues
Negative Logits
Token         Logit
ãĥĺãĥ©        -0.84
elta          -0.80
SEA           -0.79
significant   -0.78
uilt          -0.74
emale         -0.74
ousands       -0.73
inia          -0.72
OTOS          -0.71
ĻĤ            -0.70
Positive Logits
Token         Logit
nonsense      0.97
spew          0.90
bullshit      0.87
antics        0.87
excuse        0.87
brav          0.87
shenanigans   0.86
tactics       0.84
clich         0.80
arrogance     0.80
Activations Density 0.312%
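
Some tokens in the negative-logits list (for example `ãĥĺãĥ©` and `ĻĤ`) look garbled because they appear to be raw byte-level BPE token strings rather than decoded text. The sketch below assumes the underlying model uses a GPT-2-style byte-to-unicode mapping, which this page does not state; the helper names `bytes_to_unicode`, `token_bytes`, and `decode_token` are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style byte -> printable-character table (assumed tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: displayed character -> original byte value.
_CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_bytes(token):
    """Return the raw bytes behind a displayed token string."""
    return bytes(_CHAR_TO_BYTE[ch] for ch in token)


def decode_token(token):
    """Decode a displayed token to text; partial UTF-8 sequences become U+FFFD."""
    return token_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥĺãĥ©", "ĻĤ", "nonsense"]:
        print(repr(tok), token_bytes(tok), repr(decode_token(tok)))
```

Under that assumption, `ãĥĺãĥ©` decodes to the katakana `ヘラ`, while `ĻĤ` is a fragment of a multi-byte character (bytes `0x99 0x82`) and has no standalone text form. Plain ASCII tokens such as `nonsense` decode to themselves.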