INDEX
    Explanations

    phrases that urge people to take action or perform a good deed

    Negative Logits
    Token         Logit
    "ulous"       -0.16
    "837"         -0.15
    "nx"          -0.15
    "onica"       -0.15
    "ATS"         -0.14
    " heaven"     -0.14
    " Heaven"     -0.14
    "finity"      -0.14
    "phant"       -0.14
    " aspir"      -0.14
    Positive Logits
    Token         Logit
    "errupt"      0.16
    "BOT"         0.15
    "elda"        0.14
    "assin"       0.14
    "æ¦ľ"         0.14
    "oppel"       0.14
    "udder"       0.14
    "ereg"        0.14
    "elez"        0.14
    " davran"     0.13
    Act Density 0.209%

    No Known Activations