INDEX
    Explanations

    descriptive words related to internal sensations or reactions, often negative in nature

    references to strong emotional or visceral reactions

    Negative Logits
    Token            Logit
    "hips"           -0.89
    "eton"           -0.74
    " Stephenson"    -0.72
    " Aad"           -0.70
    " Chains"        -0.67
    "hare"           -0.66
    " Izan"          -0.65
    "hift"           -0.64
    " Naz"           -0.64
    "cale"           -0.63
    Positive Logits
    Token            Logit
    "ted"            1.39
    "ierrez"         1.28
    "ting"           1.21
    "ters"           1.21
    "tering"         1.08
    "tered"          1.05
    "warts"          0.96
    "terson"         0.94
    " microbiota"    0.94
    "sy"             0.92
    Activation Density: 0.025%

    No Known Activations