INDEX
    Explanations

    words associated with deception and isolation

    Negative Logits
    Token            Logit
    "PDATE"          -0.74
    "theless"        -0.73
    " Shack"         -0.71
    " Dragonbound"   -0.69
    " Doctrine"      -0.66
    " Belt"          -0.64
    " miscarriage"   -0.64
    " Penet"         -0.63
    " LORD"          -0.62
    " Heller"        -0.61
    Positive Logits
    Token        Logit
    "ations"     1.94
    "ating"      1.81
    "ates"       1.73
    "ators"      1.70
    "atory"      1.56
    "ator"       1.55
    "ational"    1.51
    "ative"      1.43
    "ated"       1.43
    "ate"        1.34
    Activation Density: 0.025%

    No Known Activations