INDEX
Explanations
instances of the word "ou" in various contexts
Negative Logits
asca     -0.18
quet     -0.18
nds      -0.16
mites    -0.15
cko      -0.15
ctal     -0.15
udic     -0.15
cken     -0.15
pane     -0.14
enery    -0.14
Positive Logits
standing  0.18
812       0.18
tro       0.17
vrir      0.17
vr        0.17
trace     0.17
skirts    0.17
ropa      0.16
erture    0.16
ltr       0.16
Activations Density: 0.008%