Explanations
punctuation marks indicating substantial breaks or separations in text
Negative Logits
addy      -0.17
eten      -0.16
eto       -0.15
avo       -0.15
áz        -0.14
楽        -0.14
erez      -0.14
æĬ        -0.14
vinces    -0.14
enth      -0.14
Positive Logits
onaut     0.15
iage      0.15
ye        0.15
oret      0.14
è¡Ĺéģĵ    0.14
gua       0.13
154       0.13
[mid      0.13
however   0.13
bat       0.13
Activations Density: 0.165%