INDEX
Explanations
patterns related to code blocks or structured formatting in programming
Negative Logits
ſſung  -0.99
нгред  -0.97
ThroughAttribute  -0.97
vooz  -0.96
للاسماء  -0.96
iſchen  -0.94
<unused68>  -0.94
[@BOS@]  -0.94
<unused14>  -0.93
<unused16>  -0.93
Positive Logits
1.05
0.87
0.85
0.82
0.81
0.80
0.80
0.79
0.75
0.74
Activations Density 0.117%