INDEX
    Explanations

    phrases related to news headlines or bullet points

    punctuation marks and symbols indicating modifications in text format

    Negative Logits
    "umenthal"      -0.68
    "pill"          -0.66
    "shift"         -0.62
    "abal"          -0.58
    "pex"           -0.57
    "tsky"          -0.57
    " Shap"         -0.56
    "web"           -0.56
    " Vish"         -0.56
    "apesh"         -0.56

    Positive Logits
    "Associated"    0.67
    "NOR"           0.66
    " sergeant"     0.65
    "heny"          0.63
    "arro"          0.62
    " BALL"         0.61
    "riad"          0.60
    " STATE"        0.60
    "outheast"      0.60
    " WRITE"        0.59

    Activation Density: 0.051%

    No Known Activations