INDEX
Explanations
numerical data and statistics related to research and medical studies
Negative Logits
243    -0.23
235    -0.22
217    -0.21
286    -0.21
285    -0.21
263    -0.21
171    -0.21
176    -0.21
276    -0.21
269    -0.21
Positive Logits
600    0.29
620    0.28
550    0.28
650    0.27
630    0.25
610    0.25
590    0.25
560    0.23
700    0.23
680    0.23
Activations Density: 0.136%