INDEX
Explanations
symbols, punctuation, and formatting indicators commonly used in dialogue or text representations
dialogue markers and names
Negative Logits
ագրություններ    -0.67
مشين    -0.61
<<<<<<<<<<<<<<    -0.59
للمعارف    -0.58
存于互联网档案馆    -0.53
porn    -0.49
rungsseite    -0.48
porn    -0.46
Porn    -0.45
noqa    -0.45
Positive Logits
EDEFAULT    0.52
libremente    0.47
contextLoads    0.47
gynhyrchwyd    0.43
MigrationBuilder    0.40
Comprometido    0.40
yoksa    0.38
Wicidata    0.37
țional    0.36
librement    0.36
Activations Density 0.136%