INDEX
Explanations
Instances of the word 'some'.
NEGATIVE LOGITS
swer            -0.18
uers            -0.16
someone         -0.16
umat            -0.15
اتÛĮ            -0.15
something       -0.15
lif             -0.15
ãĥªãĥ¼ãĤº       -0.14
esi             -0.14
_some           -0.14
POSITIVE LOGITS
ones            0.36
place           0.33
/all            0.32
kind            0.25
sort            0.24
hw              0.24
ht              0.23
ONES            0.22
-times          0.22
of              0.22
Activations Density 0.098%