INDEX
Explanations
dates or time periods
references to the term "Early" in historical contexts
Negative Logits
whatsoever   -0.67
racket       -0.67
harmless     -0.67
orcs         -0.65
rael         -0.65
spew         -0.64
conj         -0.61
captured     -0.61
Sakuya       -0.60
alist        -0.60
Positive Logits
Early        3.33
Early        3.06
Late         2.15
early        1.95
Late         1.95
early        1.83
Later        1.44
earliest     1.39
Earlier      1.23
late         1.20
Activations Density: 0.014%