INDEX
Explanations
winning or scoring
words after punctuation
Negative Logits
基づ          0.51
এরা           0.49
пу            0.47
véritables    0.47
τους          0.47
ور            0.47
データを      0.46
га            0.46
fords         0.46
څرنګوالی      0.45
Positive Logits
of            0.79
in            0.71
A             0.69
a             0.68
\             0.63
it            0.63
Presidente    0.62
$             0.59
ina           0.59
(             0.55
Activations Density 0.036%