INDEX
Explanations
programming-related functions and routines
Negative Logits
78    -0.24
79    -0.24
80    -0.23
77    -0.22
89    -0.21
83    -0.21
85    -0.20
88    -0.20
86    -0.20
87    -0.20
Positive Logits
0.28
0.27
0.26
0.26
0.23
0.23
0.20
0.19
0.18
991    0.18
Activations Density 0.008%