INDEX
Explanations
years signaling specific dates or periods in history
references to specific years in the 1800s
Negative Logits
acci       -0.91
paralle    -0.78
colo       -0.78
enegger    -0.76
atri       -0.75
notation   -0.74
glas       -0.73
alez       -0.73
ringe      -0.71
algia      -0.71
Positive Logits
sie        0.85
sburg      0.79
hrs        0.75
1861       0.73
1862       0.71
1850       0.69
Hots       0.67
1860       0.67
ゼウス     0.65
1865       0.64
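Token strings in logit lists like these are often shown in the tokenizer's internal byte-level form, which is why a non-ASCII token can surface as a string such as `ãĤ¼ãĤ¦ãĤ¹`. The sketch below assumes, without confirmation from this page, that the model uses a GPT-2-style byte-level BPE vocabulary. It rebuilds that byte-to-character table and reverses it. The helper names `gpt2_bytes_to_unicode` and `decode_vocab_string` are illustrative and not part of any dashboard API.

```python
def gpt2_bytes_to_unicode():
    """Rebuild the byte -> printable-character table used by GPT-2-style BPE.

    Assumption: the tokenizer behind this dashboard follows the GPT-2 scheme.
    """
    # Bytes that are already printable map to themselves.
    bs = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to code point 256 + n, in byte order.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in gpt2_bytes_to_unicode().items()}


def decode_vocab_string(token: str) -> str:
    """Turn a vocabulary-form token string back into readable UTF-8 text."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # A single token may hold a partial UTF-8 sequence, so do not raise.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_vocab_string("ãĤ¼ãĤ¦ãĤ¹"))
    print(decode_vocab_string("1861"))
```

Under that assumption, the raw string `ãĤ¼ãĤ¦ãĤ¹` decodes to the katakana ゼウス, while plain ASCII tokens such as `1861` pass through unchanged.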
Activations Density 0.036%