Explanations
references to "the" in various contexts
Negative Logits
mamak    -0.18
eel      -0.17
erule    -0.17
ebek     -0.17
мом      -0.15
nesc     -0.15
emd      -0.15
eyin     -0.14
urtle    -0.14
rex      -0.14
Positive Logits
lives          0.27
efforts        0.24
opinions       0.24
actions        0.23
thoughts       0.23
wishes         0.22
minds          0.22
writings       0.22
experiences    0.21
backs          0.21
Activations Density: 0.329%