INDEX
Explanations
the word "Great" and words with positive connotations
Negative Logits
(no visible token)  -1.23
↵↵                  -1.09
The                 -1.00
But                 -0.97
(no visible token)  -0.97
A                   -0.96
(no visible token)  -0.92
Little              -0.90
No                  -0.89
<eos>               -0.89
Positive Logits
auffi         1.26
purpoſe       1.08
ſmall         1.04
ſtate         1.01
ſche          1.00
RestTemplate  0.97
faſt          0.97
ſhe           0.97
noft          0.96
myſelf        0.96
Activations Density 1.432%