Explanations
instances of numerical information and data references
Negative Logits
Token    Logit
[…]      -1.84
(token missing in source)  -1.55
[…]      -1.52
–        -1.24
……       -1.23
=        -1.23
[...]    -1.15
[...]    -1.15
[`       -1.13
…..      -1.12
Positive Logits
Token    Logit
^        1.46
^        1.41
■        1.32
'^       1.31
-^       1.26
^'       1.22
«        1.21
■        1.20
»        1.18
.^       1.17
Activations Density: 1.206%