INDEX
Explanations
mentions of physical or emotional pain
references to pain and artistic expression
Negative Logits
ADS            -0.73
nanop          -0.68
indal          -0.68
ahime          -0.67
deviations     -0.66
ource          -0.66
avorite        -0.66
VERTISEMENT    -0.65
wagon          -0.65
lycer          -0.61
Positive Logits
staking        1.23
pain           1.10
ting           1.05
esville        1.00
Pain           1.00
ful            0.96
lete           0.93
issance        0.90
Painter        0.86
teness         0.84
Activations Density: 0.022%