INDEX
    Explanations

    phrases related to controversial social or political topics

    phrases about laws or regulations

    Negative Logits
    "ipeg"      -0.68
    "itarian"   -0.65
    "aval"      -0.65
    "isen"      -0.64
    "oll"       -0.64
    "rend"      -0.61
    "atre"      -0.61
    "uber"      -0.60
    "hatt"      -0.59
    "ymph"      -0.59
    Positive Logits
    " thereby"      1.13
    " citing"       1.07
    " including"    0.99
    " preferring"   0.97
    " allowing"     0.94
    " noting"       0.93
    " opting"       0.93
    " excluding"    0.93
    " implying"     0.92
    " whereby"      0.92
    Activation Density: 0.381%

    No Known Activations