INDEX
Negative Logits
Token     Logit
.";       -0.93
.";       -0.91
.',       -0.91
.",       -0.91
.';       -0.90
.\\       -0.89
.}}       -0.87
.         -0.86
.');      -0.86
.[/       -0.85
Positive Logits
Token     Logit
<bos>     0.59
and       0.57
most      0.49
just      0.43
I         0.43
actually  0.43
he        0.42
still     0.42
usually   0.41
always    0.40
Activations Density: 0.004%