Explanations
punctuation marks and discourse markers
Negative Logits
iais  -0.17
sing  -0.15
ép  -0.15
å°ı说  -0.14
nét  -0.14
иÑİ  -0.14
é  -0.14
ouro  -0.14
à¸ķà¸Ńà¸Ļ  -0.14
_COMPILE  -0.14
Positive Logits
âĶIJ  0.19
ÂĢÂĻ  0.18
¦  0.17
âķĹ  0.17
ees  0.17
Shea  0.17
/'  0.15
nat  0.15
ullivan  0.14
ration  0.14
Activations Density 0.033%