INDEX
Explanations
foreign language characters and code snippets
Negative Logits
’       1.12
0       0.85
<0x80>  0.83
that    0.82
<       0.77
’:      0.77
’;      0.75
’?      0.72
’).     0.71
’),     0.71
Positive Logits
ка       0.80
на       0.71
و        0.69
ك        0.69
скольку  0.68
도       0.66
ре       0.66
Tôi      0.64
Aqui     0.63
तरुणा    0.63
Activations Density: 0.480%