INDEX
    Explanations

    words related to strong physical reactions and movements

    terms related to compulsive behavior and its effects

    Negative Logits
    " bye"         -0.63
    "hold"         -0.59
    " Fellow"      -0.59
    " Cage"        -0.58
    "lihood"       -0.57
    " Dominion"    -0.57
    " Cald"        -0.57
    " Amen"        -0.57
    " Misty"       -0.57
    " PCIe"        -0.55

    Positive Logits
    "uls"           1.16
    "atility"       1.09
    "atile"         1.07
    "untarily"      0.94
    "ules"          0.93
    "hip"           0.92
    "atives"        0.91
    "eni"           0.90
    "erker"         0.90
    "untary"        0.89

    Act Density 0.008%

    No Known Activations