Explanations
mentions of the word "twice" in various contexts
Negative Logits
Token     Logit
li        -0.16
zug       -0.16
hausen    -0.15
ling      -0.15
name      -0.14
forth     -0.14
owell     -0.14
lier      -0.14
rell      -0.14
laus      -0.14
Positive Logits
Token     Logit
-thirds   0.23
/th       0.22
dozen     0.20
nd        0.16
ño        0.15
/errors   0.15
gether    0.15
ër        0.15
idlo      0.15
arily     0.15
Activations Density 0.008%