INDEX
Explanations
occurrences of the word "ou."
Negative Logits
ni        -0.19
nels      -0.17
ammers    -0.16
ãĥ³       -0.15
ases      -0.15
nan       -0.15
nu        -0.15
rieving   -0.15
nic       -0.15
nie       -0.15
POSITIVE LOGITS
nger      0.18
illet     0.18
cou       0.17
thern     0.17
ltre      0.17
lt        0.17
verture   0.17
eurs      0.17
ette      0.16
theast    0.16
Activations Density 0.031%