INDEX
Explanations
references to quantities and articles in various contexts
Negative Logits
Token             Logit
Monfieur          -1.07
Theſe             -1.05
Efq               -1.03
unknownFields     -0.94
iconTwitter       -0.91
autorytatywna     -0.91
PreferredItem     -0.91
ftagPool          -0.89
Majefty           -0.89
disambiguazione   -0.87
Positive Logits
Token             Logit
a                 0.67
“                 0.51
unique            0.51
‘                 0.51
an                0.50
<eos>             0.47
A                 0.46
уника             0.42
B                 0.41
our               0.41
Activations Density: 0.336%