INDEX
Explanations
historical dates and numerical references
Negative Logits
Token    Logit
197      -0.18
Elev     -0.17
76       -0.17
elson    -0.17
74       -0.17
976      -0.17
78       -0.17
974      -0.16
77       -0.15
uart     -0.15
Positive Logits
Token    Logit
188      0.31
89       0.31
88       0.29
87       0.28
86       0.26
089      0.23
Wolff    0.23
889      0.22
288      0.22
088      0.22
Activations Density: 0.054%