INDEX
    Explanations

    words related to facial expressions or emotions

    words and phrases related to physical body parts or actions

    Negative Logits
    "OTOS"      -0.81
    "IRE"       -0.63
    "LV"        -0.60
    "aurus"     -0.60
    "NL"        -0.59
    " Jav"      -0.58
    "Games"     -0.57
    "alogue"    -0.57
    "odium"     -0.56
    "RIP"       -0.55
    Positive Logits
    "ingly"     0.89
    "puff"      0.83
    "buck"      0.83
    "abouts"    0.81
    "legged"    0.77
    "yip"       0.76
    "ing"       0.73
    "edin"      0.73
    " toe"      0.72
    "footed"    0.71
    Act Density 0.118%

    No Known Activations