INDEX
    Explanations

    phrases related to societal issues and policies

    Negative Logits
    "ãĥ¥"        -0.66
    "UV"         -0.65
    " Silence"   -0.65
    " Horses"    -0.64
    "¡"          -0.64
    "PER"        -0.63
    " Shack"     -0.62
    " Norton"    -0.62
    "LV"         -0.61
    "GROUND"     -0.61
    Positive Logits
    "ividual"        1.15
    "istically"      1.02
    " identifiable"  0.96
    " who"           0.90
    "ities"          0.90
    "istic"          0.89
    "hips"           0.82
    "istical"        0.82
    "istics"         0.81
    " composing"     0.81
    Activation Density: 0.023%

    No Known Activations