INDEX
Explanations
references to numerical data or specifications
Negative Logits
Token       Logit
AUD         -0.76
Lud         -0.74
Vag         -0.70
Gleaming    -0.69
ter         -0.68
Guest       -0.67
wedge       -0.67
Gentleman   -0.66
Spectre     -0.65
ghost       -0.65
Positive Logits
Token       Logit
9           1.49
9           1.30
nine        1.03
09          1.03
909         0.99
911         0.96
Nine        0.93
09          0.90
nine        0.90
99          0.89
Activations Density: 0.030%