INDEX
Explanations
references to specific events or organized activities
dates and numbers
Negative Logits
Token          Logit
hit            -0.28
P              -0.28
ex             -0.28
sha            -0.26
Surprisingly   -0.25
ま             -0.25
staggering     -0.25
Surprisingly   -0.25
So             -0.24
ME             -0.24
Positive Logits
Token          Logit
<unused8>      0.92
[@BOS@]        0.92
<unused14>     0.92
rungsseite     0.92
<unused41>     0.91
<unused28>     0.91
<unused43>     0.91
<unused51>     0.91
<pad>          0.91
<unused3>      0.91
Activations Density 0.017%
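
As a rough illustration of what a density figure like the one above might measure, the sketch below assumes it is the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; the page itself does not state how the value is computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition, not confirmed by the source page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 2 of which activate the feature.
sample = [0.0] * 9998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.017% would mean the feature fires on roughly 17 of every 100,000 tokens.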