INDEX
    Explanations

    mentions of the word "Nash" at varying activations

    references to a specific name or entity denoted by variations of the token "ash"

    Negative Logits
    Token         Logit
    " pregn"      -0.68
    "etheless"    -0.67
    "STER"        -0.60
    "worldly"     -0.60
    "sworth"      -0.58
    "ster"        -0.58
    "oplan"       -0.57
    " elig"       -0.56
    "eering"      -0.56
    " cavity"     -0.56
    Positive Logits
    Token         Logit
    "nikov"       1.03
    "IELD"        0.97
    "ield"        0.91
    "anu"         0.89
    "imi"         0.87
    "adow"        0.84
    "rine"        0.84
    "IFT"         0.84
    "ti"          0.82
    "ares"        0.82
    Activation Density: 0.036%

    No Known Activations