INDEX
Explanations
instances of the word "the" in various contexts
Negative Logits
actly     -0.16
ther      -0.16
ment      -0.15
ightly    -0.15
sky       -0.15
imbus     -0.15
ns        -0.15
amente    -0.15
icum      -0.14
Fus       -0.14
Positive Logits
oretical  0.27
oret      0.18
odore     0.17
orem      0.16
atre      0.16
orical    0.16
/Dk       0.15
ocracy    0.15
issen     0.15
otime     0.15
Activations Density: 0.294%