Explanations
occurrences of the word "of" in various contexts
Negative Logits
mÃŃ      -0.15
aco      -0.15
anka     -0.15
isnan    -0.14
ship     -0.14
cole     -0.14
mers     -0.14
uell     -0.14
apore    -0.14
ewan     -0.14
Positive Logits
ynos     0.17
bens     0.16
CTX      0.16
lify     0.15
krom     0.15
emoji    0.15
pu       0.15
Harold   0.14
Wich     0.14
ëĬ       0.14
Activations Density 0.012%
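
For readers unfamiliar with the density figure, here is a minimal sketch of one common way such a number is computed: the fraction of sampled tokens on which the feature fires. This definition is an assumption, since the page does not say how its value was derived. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only so that the result matches the 0.012% shown above.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumed definition for illustration; the dashboard may compute
    its density differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Made-up sample: the feature fires on 3 of 25,000 tokens.
sample = [0.0] * 24997 + [0.8, 1.3, 0.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low means the feature is inactive on almost every token, so the few tokens where it does fire carry most of the evidence for the explanation.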