INDEX
Explanations
periods at the end of sentences
Negative Logits
rms      -0.16
d        -0.15
b        -0.14
oord     -0.13
p        -0.13
½æķ°     -0.13
Cass     -0.13
bib      -0.13
[        -0.13
s        -0.13
Positive Logits
ÄįnÃŃk   0.17
uchen    0.15
IRCLE    0.15
ERO      0.15
/sdk     0.15
ottes    0.15
Ñģклад   0.14
ÙĥÙĪÙħ   0.14
erdale   0.14
IZE      0.14
Activations Density: 0.431%