INDEX
Explanations
negative words related to behavior or attitudes such as insanity, ignorance, arrogance, and hypocrisy
themes related to mental health issues and societal ignorance
Negative Logits
amins          -0.86
icles          -0.82
icle           -0.81
Interstitial   -0.80
ergy           -0.75
ramer          -0.73
ta             -0.73
umm            -0.73
char           -0.71
arers          -0.71
Positive Logits
prejudice      1.03
intolerance    0.91
lessness       0.89
yip            0.88
xtap           0.88
fulness        0.87
stupidity      0.84
ignorance      0.84
malice         0.82
ophobia        0.80
Activations Density 0.051%
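
The density figure can be read as the share of sampled tokens on which the feature fires. The sketch below shows that calculation under that assumption. The page does not define the metric, so the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and not taken from the tool that produced this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of sampled tokens with a
    nonzero (above-threshold) activation. The source page does not say so.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 10,000 tokens, of which 5 activate the feature.
sample = [0.0] * 9995 + [0.7, 1.2, 0.3, 2.1, 0.9]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density near 0.05% means the feature is active on roughly one token in two thousand, which fits a feature that responds to a narrow set of words.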