INDEX
    Explanations

    words related to historical figures

    references to historical figures and their societal impacts

    Negative Logits
    Token             Logit
    " deduct"         -0.78
    "PDATE"           -0.77
    "ournal"          -0.75
    " subscript"      -0.73
    " perman"         -0.72
    " tremend"        -0.70
    "ICAN"            -0.69
    " subsid"         -0.65
    " millenn"        -0.65
    " unfavorable"    -0.65
    Positive Logits
    Token             Logit
    " Jr"             0.94
    "hurst"           0.84
    "wald"            0.81
    "berger"          0.80
    "bert"            0.79
    "ridge"           0.79
    "tein"            0.78
    " Norton"         0.78
    "berg"            0.78
    "bie"             0.77
    Activation Density: 0.279%

    No Known Activations