INDEX
Explanations
number and unit specifications
Negative Logits
es  0.95
ad  0.93
ב  0.88
at  0.84
to  0.82
in  0.80
is  0.79
ס  0.79
en  0.78
of  0.77
Positive Logits
risposta  0.71
esimerk  0.67
hindrance  0.66
preparación  0.65
regimen  0.65
conexión  0.65
flowchart  0.64
ման  0.64
destrucción  0.64
он  0.63
Activations Density 0.123%