INDEX
Explanations
punctuation marks and symbols, particularly parentheses and question marks
Negative Logits
ویکیپدیا    -0.72
banc    -0.65
ον    -0.65
виправивши    -0.65
ilit    -0.65
gany    -0.64
squee    -0.63
führt    -0.62
Abit    -0.60
newBuilder    -0.60
Positive Logits
(    1.22
(    1.18
:(    1.07
》(    0.99
[    0.97
)(    0.97
{    0.94
!(    0.92
出版年    0.90
”(    0.89
Activations Density: 0.058%