INDEX
    Explanations

    expressions related to personal beliefs or political statements

    expressions of sincerity and personal commitment

    Negative Logits
    "wolves"        -0.64
    "Patch"         -0.63
    " themselves"   -0.62
    "astical"       -0.59
    " pricey"       -0.58
    " pesky"        -0.56
    "izens"         -0.56
    " Pair"         -0.55
    "Textures"      -0.55
    " mutants"      -0.55
    Positive Logits
    " myself"       1.60
    " my"           1.20
    " personally"   0.95
    "My"            0.79
    " privileged"   0.79
    " unres"        0.78
    " MY"           0.77
    " My"           0.76
    " am"           0.76
    " hereby"       0.74
    Act Density 0.699%
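
The density figure above is not defined on this page. A minimal sketch of one common reading, assuming "Act Density" means the fraction of token positions on which the feature's activation is nonzero, is given below. The function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions where the feature fires.

    Assumption: a position counts as active when its activation is
    strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations over 1000 token positions, 7 of them nonzero.
sample = [0.0] * 993 + [0.4, 1.2, 0.1, 2.5, 0.3, 0.9, 0.05]
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

Under this reading, a density below 1% would mean the feature is sparse, firing on only a small share of tokens.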

    No Known Activations