Explanations
specific proper nouns, particularly related to meteorites and medical terms
Negative Logits
ts      -0.30
ta      -0.26
te      -0.24
tn      -0.23
td      -0.23
tem     -0.23
to      -0.23
ÑĤ      -0.23
tf      -0.22
hs      -0.22
Positive Logits
(es     0.29
ness    0.24
’       0.23
'       0.23
d       0.21
cribe   0.20
det     0.20
den     0.19
ÙĨاÙħÙĩ  0.19
sing    0.19
Activations Density 1.382%