INDEX
Explanations
different types or categories of items or concepts
type of thing
Negative Logits
OGND              -0.86
queſto            -0.77
MigrationBuilder  -0.77
[@BOS@]           -0.75
<unused28>        -0.75
<pad>             -0.75
<unused52>        -0.75
<unused74>        -0.75
<unused41>        -0.75
<unused8>         -0.74
Positive Logits
stuff        0.33
thing        0.30
cuestión     0.30
K            0.29
Reprodução   0.29
folks        0.28
cosas        0.28
ahí          0.28
chrétienne   0.27
part         0.27
Activations Density: 0.237%