INDEX
Explanations
instances of the letter 'D' in various contexts
Negative Logits
omain     -0.21
avers     -0.16
emoc      -0.16
atabase   -0.15
ieu       -0.15
aver      -0.15
onn       -0.15
ourse     -0.14
portun    -0.14
ocker     -0.14
Positive Logits
egr       0.24
une       0.20
ichen     0.18
UNE       0.17
_ax       0.17
ora       0.17
ances     0.17
OP        0.17
egas      0.16
ceu       0.16
Activations Density: 0.011%