INDEX
Explanations
words related to historical figures
references to historical figures and their societal impacts
Negative Logits
deduct       -0.78
PDATE        -0.77
ournal       -0.75
subscript    -0.73
perman       -0.72
tremend      -0.70
ICAN         -0.69
subsid       -0.65
millenn      -0.65
unfavorable  -0.65
Positive Logits
Jr           0.94
hurst        0.84
wald         0.81
berger       0.80
bert         0.79
ridge        0.79
tein         0.78
Norton       0.78
berg         0.78
bie          0.77
Activations Density 0.279%