INDEX
    Explanations

    mentions of spam

    occurrences and discussions of spam

    Negative Logits
    hani        -1.12
    IST         -0.75
     Borders    -0.67
     Fathers    -0.66
     Remem      -0.65
     Cel        -0.65
     Syri       -0.64
     Statue     -0.63
    avery       -0.63
     Patri      -0.63
    Positive Logits
    ming        1.25
     spam       1.02
    inator      0.92
    ulent       0.83
    icons       0.82
    icide       0.82
    vertising   0.82
    ular        0.81
    mers        0.81
    bag         0.80
    Act Density 0.005%

    No Known Activations