INDEX
Explanations
results, benefits, impact, consequences
Negative Logits
Token   Logit
ä       0.37
ning    0.31
an      0.30
\       0.30
I       0.29
are     0.26
        0.26
bank    0.26
ö       0.26
lain    0.25
Positive Logits
Token   Logit
al      0.39
at      0.36
ad      0.35
ла      0.35
ன்      0.33
ות      0.33
ar      0.32
in      0.31
ட       0.29
Aplic   0.29
Activations Density 0.716%