INDEX
Explanations
the presence of the word "of" in various contexts
Negative Logits
ожд         -0.16
ãĥ¼ãĥĪ      -0.15
exactly     -0.14
odo         -0.14
711         -0.14
til         -0.13
oulder      -0.13
imat        -0.13
afi         -0.13
as          -0.13
Positive Logits
OOK         0.15
aign        0.14
agher       0.14
nebu        0.14
Aggregate   0.14
eens        0.14
Hook        0.14
Punch       0.13
weep        0.13
lk          0.13
Activations Density: 0.154%