Explanations
coding syntax and structure indicators
Negative Logits
Token      Logit
172        -0.20
Feb        -0.19
Sep        -0.19
173        -0.18
128        -0.18
Kin        -0.18
178        -0.17
177        -0.16
512        -0.16
222        -0.15
Positive Logits
Token      Logit
(blank)     0.46
105         0.35
106         0.23
Ľ           0.21
åįģäºĶ      0.20
15          0.20
fifteen     0.20
(blank)     0.20
(blank)     0.19
↵ ↵         0.19
Activations Density: 0.036%
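
Some entries in the positive logits list, such as åįģäºĶ and Ľ, look like the display form of byte-level BPE tokens rather than readable text. The following is a minimal sketch for decoding such strings, assuming the model uses a GPT-2-style byte-to-unicode table (the dashboard does not state which tokenizer is in use). The function names `bytes_to_unicode` and `decode_token` are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Byte -> printable character table, as in GPT-2-style byte-level BPE (assumed)."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    # Printable bytes map to themselves.
    table = {b: chr(b) for b in printable}
    # Every other byte is shifted into the range starting at U+0100,
    # in ascending byte order.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


# Inverse table: display character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Turn a token's display string back into text.

    Byte sequences that are not valid UTF-8 on their own (for example a
    lone continuation byte) come back as U+FFFD.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in display)
    return raw.decode("utf-8", errors="replace")


print(decode_token("åįģäºĶ"))   # 十五
print(decode_token("fifteen"))  # fifteen
```

Under that assumption, åįģäºĶ decodes to 十五 (Chinese for fifteen), which lines up with the neighbouring tokens 15 and fifteen. Ľ maps to a single UTF-8 continuation byte, so it has no standalone text form.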