INDEX
Explanations
punctuation and periods in the text
Negative Logits
chter        -0.16
icap         -0.16
ispecies     -0.15
’B           -0.15
astery       -0.15
enal         -0.14
.trailing    -0.14
invest       -0.14
akin         -0.14
Hos          -0.14
Positive Logits
oya          0.15
abr          0.15
лÑıн         0.15
ÙĬا          0.15
olit         0.15
orado        0.15
omed         0.14
hangi        0.13
DTD          0.13
gra          0.13
Activations Density 0.003%
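
A density figure like the one above is commonly read as the fraction of sampled token positions on which the feature fires at all. This page does not confirm that definition, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all sampled positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 3 of 100,000 token positions.
sample = [0.0] * 99_997 + [0.8, 1.2, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.003%
```

Under this reading, a density of 0.003% would correspond to roughly 3 active positions per 100,000 sampled tokens, which marks a very sparse feature.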