INDEX
Explanations
instances of the phrase "no pain."
phrases related to the concept of effort and reward
Negative Logits
Token       Logit
imating     -0.69
eele        -0.68
avis        -0.67
av          -0.66
challeng    -0.66
col         -0.65
aver        -0.65
vet         -0.64
artney      -0.64
Tip         -0.64
Positive Logits
Token       Logit
no          1.55
no          1.49
NO          1.32
No          1.28
No          1.22
NO          1.21
none        1.13
nothing     1.03
nobody      1.02
zero        0.99
Activations Density: 0.217%