INDEX
Explanations
references to numbers, particularly related to quantity or lists
Negative Logits
3    -0.76
4    -0.75
5    -0.72
7    -0.71
6    -0.69
2    -0.69
0    -0.68
9    -0.68
8    -0.67
1    -0.64
Positive Logits
aarrggbb    1.05
huit        1.05
Према       1.01
four        1.00
eight       0.98
Four        0.98
seven       0.96
eight       0.95
five        0.95
seven       0.95
Activations Density: 0.121%
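
For readers unfamiliar with these statistics, the sketch below shows one common way such numbers are produced. It is an assumption, not something this page states: that the logit lists come from projecting a feature's decoder direction through the model's unembedding matrix, and that activation density is the fraction of tokens on which the feature fires. The names `top_logit_effects`, `activation_density`, `decoder_dir`, `unembed`, and the toy vocabulary are hypothetical and the data is random.

```python
import numpy as np

def top_logit_effects(decoder_dir, unembed, vocab, k=10):
    """Return the k most negative and k most positive (token, effect) pairs.

    Assumes decoder_dir has shape (d_model,) and unembed has shape
    (d_model, vocab_size); both are hypothetical stand-ins.
    """
    effects = decoder_dir @ unembed      # one value per vocabulary token
    order = np.argsort(effects)          # ascending order of effect
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return negative, positive

def activation_density(activations):
    """Fraction of tokens on which the feature activation is above zero."""
    return float(np.mean(np.asarray(activations) > 0))

# Toy example with random weights (not real model data).
rng = np.random.default_rng(0)
vocab = ["3", "4", "5", "four", "eight", "huit"]
unembed = rng.normal(size=(8, len(vocab)))
decoder_dir = rng.normal(size=8)

negative, positive = top_logit_effects(decoder_dir, unembed, vocab, k=3)
density = activation_density([0.0, 0.0, 0.0, 1.5])
```

Under these assumptions, `negative` is sorted from most negative upward and `positive` from most positive downward, matching the ordering of the two lists above, and a density of 0.121% would mean the feature fires on roughly 1 in 830 tokens.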