INDEX
Explanations
words describing emotional and physical experiences
Negative Logits
Efq         -1.24
Monfieur    -1.13
pleaſure    -1.13
houſe       -1.13
Eſ          -1.10
purpoſe     -1.09
myſelf      -1.05
itſelf      -1.04
himſelf     -1.02
vellous     -0.97
Positive Logits
<bos>       0.68
,           0.61
-           0.53
ộn          0.52
L           0.51
<i>         0.47
.           0.47
can         0.46
a           0.45
--          0.44
Activations Density 1.229%