INDEX
Explanations
pronouns that indicate possession or ownership
possessive pronouns
Negative Logits
Token                      Logit
(token missing in source)  -0.47
,                          -0.42
The                        -0.42
.                          -0.41
;                          -0.38
[                          -0.37
//                         -0.36
today                      -0.35
(                          -0.35
try                        -0.34
Positive Logits
Token          Logit
rungsseite     0.97
ſſung          0.97
<unused16>     0.96
<unused3>      0.96
<unused8>      0.96
<unused42>     0.96
<unused74>     0.96
<unused41>     0.96
[@BOS@]        0.96
<unused17>     0.96
Activations Density 0.084%