INDEX
    Explanations

    names of specific entities or figures

    Negative Logits
    Token            Logit
    " beware"        -0.77
    " patiently"     -0.71
    " aloud"         -0.70
    " forcefully"    -0.69
    " delivered"     -0.69
    " partake"       -0.69
    "thood"          -0.69
    " stopping"      -0.67
    " recite"        -0.66
    "perse"          -0.66

    Positive Logits
    Token            Logit
    "oret"           1.28
    "atre"           1.24
    "odor"           1.19
    " Hague"         1.17
    "resa"           1.14
    "orem"           1.09
    " Economist"     1.06
    " Simpsons"      1.06
    "odore"          1.03
    " Guardian"      1.02
    Activation Density: 0.463%

    No Known Activations