    Explanations

    words related to challenging or being challenged

    contexts related to challenging authority or established norms

    Negative Logits
    "opter"        -0.68
    "ãĥ¼ãĥĨãĤ£"    -0.68
    "abet"         -0.65
    " istg"        -0.65
    "psons"        -0.64
    "ppa"          -0.64
    "··"           -0.62
    "anuts"        -0.61
    "]}"           -0.61
    "around"       -0.61
    Positive Logits
    " assumptions"       1.04
    " incumb"            0.93
    " incumbent"         0.92
    " precon"            0.90
    " stereotypes"       0.89
    " perceptions"       0.87
    " orthodoxy"         0.86
    " misconceptions"    0.84
    " boundaries"        0.78
    " assertions"        0.78
    Activation Density: 0.066%

    No Known Activations