INDEX
Explanations
instances of the phrase "I do" in various contexts
Negative Logits
b         -0.18
اÛĮÙĩ      -0.18
doing     -0.17
ness      -0.17
noon      -0.17
nt        -0.17
ature     -0.16
pher      -0.16
gr        -0.16
sterol    -0.15
Positive Logits
zed       0.22
ÂŃing     0.21
zen       0.21
ings      0.21
(es       0.21
xor       0.20
yles      0.20
able      0.20
ctr       0.20
led       0.19
Activations Density: 0.045%