INDEX
    Explanations

    the name "Joe" with considerable strength

    mentions of the name "Joe."

    NEGATIVE LOGITS
    "NESS"          -0.92
    "hips"          -0.87
    " glim"         -0.79
    "igated"        -0.73
    "igators"       -0.72
    "mble"          -0.70
    "igator"        -0.69
    " narrator"     -0.69
    "ample"         -0.69
    "raints"        -0.67
    POSITIVE LOGITS
    " Biden"        0.99
    "pport"         0.90
    "ppo"           0.89
    " Arpaio"       0.89
    "zzi"           0.86
    " Rog"          0.81
    " Scarborough"  0.80
    " Camel"        0.79
    "xtap"          0.79
    " Russo"        0.79
    Activation Density: 0.029%

    No Known Activations