    Explanations

    references to the concept of "handedness", with high activation values on the term "handed"

    references to handedness, particularly in relation to left-handedness

    Negative Logits
    Token       Logit
    "ouf"       -0.73
    "lé"        -0.70
    "LAN"       -0.68
    " Delta"    -0.68
    "CHAT"      -0.68
    " Detect"   -0.68
    " Scores"   -0.66
    " Eps"      -0.66
    "Tonight"   -0.66
    "ETF"       -0.66
    Positive Logits
    Token       Logit
    "handed"    1.31
    " nodd"     1.07
    "maid"      0.92
    " showc"    0.88
    "footed"    0.85
    "hander"    0.82
    " destro"   0.82
    " enthusi"  0.79
    " axe"      0.78
    "ragon"     0.77
    Activation Density: 0.007%

    No Known Activations