INDEX
    Explanations

    instances where someone publicly criticizes or points out something or someone

    phrases emphasizing calling someone out or criticizing actions

    Negative Logits
    Token         Logit
    " gamble"     -0.67
    " princip"    -0.66
    "fing"        -0.63
    "oreal"       -0.62
    "entry"       -0.61
    " mint"       -0.61
    " assum"      -0.59
    "parable"     -0.59
    " confir"     -0.58
    "ushima"      -0.58
    Positive Logits
    Token         Logit
    "stretched"   0.98
    " loud"       0.97
    "casts"       0.82
    " loudly"     0.74
    "posts"       0.74
    " Sinclair"   0.72
    "tical"       0.70
    "lier"        0.70
    "tics"        0.70
    "smart"       0.69
    Activation Density: 0.018%

    No Known Activations