INDEX
Explanations
references to historical figures, specifically Abraham Lincoln
Negative Logits
ttes          -0.81
nces          -0.79
essee         -0.78
prus          -0.74
agra          -0.70
hift          -0.66
TOP           -0.66
Flavoring     -0.66
Downloadha    -0.64
Sahara        -0.63
Positive Logits
raham         1.10
son           1.04
Lincoln       0.99
sen           0.91
sson          0.85
sburg         0.84
antine        0.81
shire         0.81
ode           0.80
Abram         0.79
Activations Density 0.027%
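
The page does not define the density figure. A common reading is the fraction of sampled tokens on which the feature has a nonzero activation, and the sketch below assumes that reading. The helper name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical, chosen only to illustrate how a value of this size could arise.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires; the source page does not state its exact definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 27 of 100,000 tokens.
sample = [1.5] * 27 + [0.0] * 99_973
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.027%
```

A density this low would be consistent with a narrow feature that fires only on a specific name and closely related tokens, though that interpretation is not confirmed by the page itself.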