INDEX
Explanations
sections of text related to scientific evaluation and methodology
Negative Logits
something     -0.74
something     -0.69
lainnya       -0.65
anything      -0.65
anything      -0.64
autre         -0.64
addirittura   -0.63
tudo          -0.63
semuanya      -0.61
muchísimo     -0.61
Positive Logits
selected      1.20
ausgewä       0.96
various       0.96
Selected      0.96
selected      0.93
SELECTED      0.92
Selected      0.85
various       0.83
select        0.82
SELECTED      0.81
Activations Density 2.278%