INDEX
    Explanations

    words indicating confusion or being puzzled

    terms related to confusion or puzzlement

    Negative Logits
    Token        Logit
    "amins"      -0.80
    "igers"      -0.72
    "roleum"     -0.69
    "mens"       -0.63
    "llan"       -0.63
    "rio"        -0.63
    "umption"    -0.63
    " credits"   -0.62
    "ppo"        -0.62
    "ods"        -0.61
    Positive Logits
    Token         Logit
    " baff"       1.21
    " baffled"    1.15
    " perplex"    1.10
    " Puzz"       1.04
    "ingly"       1.03
    " bewild"     0.95
    " vex"        0.92
    " puzz"       0.91
    " puzzled"    0.89
    " puzzling"   0.86
    Activation Density: 0.025%
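
As a rough illustration of what a density figure such as 0.025% could represent, the sketch below assumes it is the fraction of tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are assumptions made for illustration, not details taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition (not confirmed by the source):
        density = (# tokens with activation > threshold) / (# tokens)
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 tokens, of which 25 activate the feature.
sample = [0.0] * 99_975 + [1.3] * 25
density = activation_density(sample)
formatted = f"{density:.3%}"
print(formatted)
```

Under this assumed definition, a density of 0.025% would correspond to about 25 active tokens per 100,000, which fits a feature tied to a small family of words.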

    No Known Activations