INDEX
    Explanations

    phrases related to dispute or conflict

    phrases containing varying degrees of negativity or strong criticism

    Negative Logits
     Gaul      -0.75
     Rudd      -0.75
     SAM       -0.73
     Polk      -0.72
     Nau       -0.69
     Slug      -0.69
     Monkey    -0.67
     Doodle    -0.66
     Hud       -0.65
     Filter    -0.64
    Positive Logits
    extremely      1.15
    responsible    1.14
    expected       1.12
    reci           1.10
    emb            1.09
    treated        1.08
    very           1.07
    sufficient     1.05
    absolutely     1.05
    really         1.05
    Activation Density: 0.093%

    No Known Activations