Explanations
words related to a sense of inability, vulnerability, or powerlessness
terms related to feelings of helplessness or powerlessness
Negative Logits
issue      -0.85
anners     -0.80
andals     -0.75
pheus      -0.71
ATT        -0.71
aternity   -0.71
akes       -0.70
occ        -0.70
ickr       -0.70
paying     -0.70
Positive Logits
helpless    1.32
ness        0.99
nesses      0.96
NESS        0.95
powerless   0.88
ingly       0.82
pige        0.81
pless       0.81
redes       0.79
strugg      0.79
Activations Density 0.012%