INDEX
Explanations
1 followed by units or numbers
Negative Logits
Token     Logit
an        0.48
be        0.46
}         0.45
a         0.44
ak        0.40
nytt      0.40
s         0.38
he        0.38
is        0.38
G         0.38

Positive Logits
Token     Logit
.         0.47
ста       0.39
출        0.39
д         0.39
<0x0D>    0.38
1         0.38
د         0.38
1         0.37
.*        0.37
ot        0.37
Activations Density 1.565%
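
For context on the density figure, here is a minimal sketch of how such a value could be computed, assuming that activation density means the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all illustrative assumptions and do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 16 of which activate the feature.
sample = [0.0] * 984 + [1.2] * 16
density = activation_density(sample)
print(f"Activation density: {density:.3%}")
```

Under this reading, a density of 1.565% would mean the feature fires on roughly 1 in 64 tokens of the evaluation text.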