    Explanations

    names of individuals in text

    mentions of specific individuals, likely authority figures or experts, in the context of providing quotes or insights

    Negative Logits
    Token           Logit
    " revenge"      -0.67
    "tumblr"        -0.64
    " vigilante"    -0.63
    " diaper"       -0.62
    " KKK"          -0.62
    " racist"       -0.61
    "ĸļ"            -0.59
    " reincarn"     -0.59
    " youtube"      -0.59
    "*/("           -0.58
    Positive Logits
    Token           Logit
    "endor"         0.74
    "ansky"         0.71
    "hani"          0.71
    "lett"          0.71
    "mann"          0.71
    "rup"           0.70
    "enda"          0.69
    "owsky"         0.69
    "elli"          0.68
    "patrick"       0.68
    Activation Density 0.518%

    No Known Activations