INDEX
Explanations
various forms of punctuation and discourse markers indicating dialogue or strong emotional expression
Negative Logits
Voor      -0.15
Trim      -0.14
Vor       -0.14
пара      -0.14
Logical   -0.14
'..       -0.13
tert      -0.13
Mehmet    -0.13
vor       -0.13
Tit       -0.13
Positive Logits
thing     0.18
ukan      0.16
Hodg      0.15
erna      0.15
ÂĿ        0.15
etz       0.15
گان       0.15
aby       0.15
entence   0.14
地下      0.14
Activations Density 0.117%