INDEX
    Explanations

    comparative phrases and comparisons

    comparisons about social justice or inequality

    Negative Logits
    " pole"      -0.64
    " Summit"    -0.64
    "trop"       -0.62
    "agra"       -0.61
    "HEAD"       -0.60
    "atha"       -0.60
    "PHOTOS"     -0.59
    "estone"     -0.59
    " ABE"       -0.59
    "minist"     -0.58
    Positive Logits
    " it"        0.69
    " you"       0.68
    " experien"  0.65
    "eem"        0.65
    " they"      0.63
    " he"        0.63
    " Malk"      0.61
    "apan"       0.60
    "him"        0.60
    "hov"        0.60
    Activation Density: 0.325%

    No Known Activations