    Explanations

    names of famous individuals or entities

    references to something being famous

    Negative Logits
    "otent"    -0.71
    "avery"    -0.68
    " Bots"    -0.67
    "ifle"     -0.67
    "heed"     -0.66
    "vae"      -0.65
    "alone"    -0.65
    "cise"     -0.65
    "adies"    -0.65
    "THER"     -0.63
    Positive Logits
    " famous"       1.04
    "rities"        1.01
    "famous"        0.95
    " Famous"       0.82
    " nickname"     0.80
    " headlines"    0.80
    " infamous"     0.78
    "ness"          0.74
    " renown"       0.74
    "NESS"          0.73
    Activation Density: 0.009%

    No Known Activations