Explanations
numerical values and dates within historical contexts
invented in the 20th century
Negative Logits
(no visible token)    -0.57
(                     -0.54
↵                     -0.49
(no visible token)    -0.49
↵↵                    -0.46
(no visible token)    -0.44
The                   -0.43
<b>                   -0.42
(                     -0.40
_                     -0.40
Positive Logits
EconPapers    1.05
<unused74>    1.05
<unused14>    1.05
[@BOS@]       1.04
<unused43>    1.04
<unused28>    1.04
<unused42>    1.04
<unused41>    1.04
<unused16>    1.04
<unused23>    1.04
Activations Density 0.068%