INDEX
    Explanations

    words related to challenges to the existing system or status quo

    concepts related to societal norms and critiques of status quos

    Negative Logits
    "anmar"      -0.64
    "undai"      -0.64
    "uddenly"    -0.63
    "ETHOD"      -0.63
    "inav"       -0.60
    " Zup"       -0.59
    " Mayhem"    -0.58
    "zbek"       -0.57
    "PROV"       -0.57
    "ometimes"   -0.56
    Positive Logits
    " of"          0.91
    "iest"         0.86
    "iness"        0.80
    " thereof"     0.78
    "lessness"     0.77
    " quo"         0.75
    "ifice"        0.75
    "iveness"      0.72
    " fallacy"     0.72
    " dimension"   0.69
    Activation Density: 0.379%

    No Known Activations