INDEX
Explanations
the beginning of sentences, particularly clauses starting with "As"
phrases beginning with "As"
Negative Logits
thro          -0.66
abouts        -0.65
ALLY          -0.63
ãĤī           -0.59
ritic         -0.59
lean          -0.58
whine         -0.58
whatsoever    -0.58
âϦ            -0.57
didn          -0.56
Positive Logits
pects         1.27
semb          1.22
ymm           1.15
phalt         1.13
bestos        1.13
piring        1.09
ynchronous    1.04
semble        1.02
piration      1.02
king          0.98
Activations Density: 0.066%