INDEX
    Explanations

    Twitter posts containing images

    instances of the word "pic" indicating pictures or images

    Negative Logits
    Token           Logit
    " Leilan"       -0.76
    " planner"      -0.69
    " hindsight"    -0.69
    " footing"      -0.67
    " delegation"   -0.66
    "theless"       -0.66
    " nomine"       -0.65
    "NetMessage"    -0.63
    " nineteen"     -0.62
    " segregation"  -0.61

    Positive Logits
    Token           Logit
    "twitter"        0.95
    " snapped"       0.83
    "://"            0.80
    "colo"           0.79
    "TED"            0.78
    "youtu"          0.74
    "Snap"           0.72
    " Tweet"         0.71
    "ares"           0.71
    " img"           0.70
    Activation Density: 0.013%

    No Known Activations