Explanations
proper nouns and specific names
appearing at the beginning of words
prefixes of entities with long s spellings
Negative Logits
Token         Logit
              -0.73
,             -0.66
(             -0.60
base          -0.60
in            -0.59
thenReturn    -0.59
:             -0.56
.             -0.55
of            -0.55
set           -0.54
Positive Logits
Token         Logit
ſelf          1.10
ſelves        1.02
queſta        0.99
Diſ           0.98
handker       0.98
Anſ           0.98
་་            0.95
Forumite      0.94
ſind          0.93
ſever         0.92
Activations Density 0.545%
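
One of the explanations above describes the feature in terms of long s spellings. As a rough sanity check, the sketch below counts how many of the listed positive-logit tokens contain the long s character (U+017F). The token/value pairs are copied from the positive-logit list. The names `positive_logits`, `LONG_S`, and `tokens_with_long_s` are illustrative assumptions and not part of any dashboard API.

```python
# Top positive-logit tokens and values, copied from the list above.
positive_logits = [
    ("ſelf", 1.10), ("ſelves", 1.02), ("queſta", 0.99), ("Diſ", 0.98),
    ("handker", 0.98), ("Anſ", 0.98), ("་་", 0.95), ("Forumite", 0.94),
    ("ſind", 0.93), ("ſever", 0.92),
]

LONG_S = "\u017f"  # LATIN SMALL LETTER LONG S


def tokens_with_long_s(pairs):
    """Return the tokens (hypothetical helper) that contain a long s."""
    return [token for token, _ in pairs if LONG_S in token]


matches = tokens_with_long_s(positive_logits)
print(f"{len(matches)} of {len(positive_logits)} tokens contain long s: {matches}")
```

A check like this covers only the ten tokens shown here. It says nothing about the feature's behavior on the rest of the vocabulary or on actual activations.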