INDEX
Explanations
dates and their associated context within the text
Negative Logits
Token    Logit
ANTE     -0.16
371      -0.15
ivant    -0.15
alam     -0.15
phas     -0.14
nar      -0.14
bih      -0.14
lorem    -0.14
Butt     -0.14
andi     -0.14
Positive Logits
Token    Logit
201      0.24
200      0.19
last     0.17
202      0.17
agues    0.14
agoon    0.14
pig      0.14
ownt     0.14
iferay   0.14
letzten  0.14
Activations Density: 0.045%