INDEX
    Explanations

    adjectives and verbs related to expressing opinions or attitudes

    terms related to strictness and transparency in decision-making

    Negative Logits
    "ioxide"       -0.74
    "brance"       -0.68
    "aleb"         -0.66
    "anyon"        -0.65
    " vanishing"   -0.64
    "ruction"      -0.64
    " McDonnell"   -0.63
    "Ranked"       -0.61
    "tein"         -0.61
    "ogg"          -0.60
    Positive Logits
    " enough"      0.81
    "ãĥ¼ãĤ¯"       0.76
    "enough"       0.74
    "ceptive"      0.72
    " reacting"    0.71
    "looking"      0.71
    "ergic"        0.70
    " minded"      0.70
    "atively"      0.70
    " towards"     0.68
    Act Density 0.205%

    No Known Activations