Explanations
instances of verbs related to experiencing pleasure or satisfaction
expressions of enjoyment or pleasure
Negative Logits
ural         -0.77
ethical      -0.68
raft         -0.67
defect       -0.64
sil          -0.63
pled         -0.63
bag          -0.61
scrimmage    -0.60
polymer      -0.59
lers         -0.59
Positive Logits
nels         0.86
ably         0.86
joy          0.83
lihood       0.83
enjoyment    0.82
OWER         0.80
enjoys       0.78
ĸļ           0.76
enjoying     0.75
enjoyed      0.73
Activations Density: 0.030%