Explanations
instances of the word "unfortunately."
Negative Logits
conserv    -0.15
vim        -0.15
Pu         -0.14
ür         -0.14
cono       -0.14
nte        -0.14
synonym    -0.14
ve         -0.14
und        -0.13
ñana       -0.13
Positive Logits
çe         0.17
nop        0.16
ably       0.16
robe       0.15
&action    0.15
uce        0.14
611        0.14
805        0.14
_lift      0.14
ìľ¼        0.14
Activations Density: 0.016%