INDEX
Explanations
highly cited or acknowledged numerical statistics in a context
Negative Logits
Token      Logit
302        -0.18
088        -0.17
Schmidt    -0.17
806        -0.17
904        -0.15
298        -0.15
602        -0.15
804        -0.15
024        -0.15
704        -0.15
Positive Logits
Token      Logit
100        0.28
200        0.27
900        0.23
400        0.21
130        0.21
700        0.21
70         0.21
300        0.20
140        0.19
90         0.19
Activations Density: 0.061%