INDEX
    Explanations

    phrases related to societal or ethical issues concerning various demographics and practices

    references to animal welfare and the implications of industrial practices

    Negative Logits
    " Tomorrow"      -0.63
    "interrupted"    -0.63
    "inside"         -0.60
    " Together"      -0.59
    "acca"           -0.59
    "igsaw"          -0.57
    "inders"         -0.57
    " Inside"        -0.57
    "Inside"         -0.57
    " Oops"          -0.55
    Positive Logits
    " usually"       1.72
    " rarely"        1.69
    " generally"     1.67
    " typically"     1.64
    " invariably"    1.60
    "usually"        1.60
    " often"         1.59
    " seldom"        1.59
    "often"          1.55
    " tends"         1.54
    Activation Density: 0.770%

    No Known Activations