INDEX
Explanations
issues related to personal responsibility and societal critique
Negative Logits
ligiloj           -0.51
олові             -0.50
Datuak            -0.46
なかなか          -0.44
angusti           -0.44
modelAndView      -0.43
gynhyrchwyd       -0.42
LabelTagHelper    -0.42
morgon            -0.42
ostavi            -0.42
Positive Logits
morons            0.92
hypocritical      0.88
dumbass           0.84
idiotic           0.83
fucking           0.83
hypocrisy         0.81
lmfao             0.80
lmao              0.80
idiots            0.80
ignorant          0.78
Activations Density: 1.545%