Explanations
words related to physical discomfort or negative experiences
references to headaches and their impact, contrasting them with pleasantries or unpleasant situations
Negative Logits
yson        -0.71
ature       -0.66
Max         -0.66
ized        -0.63
Origin      -0.63
wana        -0.63
reverse     -0.62
Age         -0.61
laws        -0.61
max         -0.61
Positive Logits
pleasant    2.22
unpleasant  2.14
headache    1.78
pleasant    1.73
headaches   1.58
painful     1.40
enjoyable   1.27
thorn       1.26
friction    1.24
pleasure    1.14
Activations Density: 0.023%