INDEX
Explanations
the overall effectiveness or summary results in a given context
Negative Logits
Token    Logit
er       -0.75
en       -0.71
bø       -0.66
u        -0.66
o        -0.65
an       -0.62
in       -0.60
ik       -0.60
y        -0.60
io       -0.59
Positive Logits
Token      Logit
OVERALL    1.94
overall    1.87
overall    1.83
Overall    1.80
Overall    1.79
overal     1.32
overalls   1.17
itſelf     1.16
Insgesamt  1.11
geral      1.10
Activations Density 0.093%