INDEX
    Explanations

    instances where instructions or actions are being suggested or described

    verbs related to actions that involve engagement or participation

    Negative Logits
    Token           Logit
    "heim"          -0.73
    "nton"          -0.70
    "EMS"           -0.66
    "lat"           -0.64
    " unveiling"    -0.60
    "dt"            -0.59
    "cerning"       -0.59
    "ether"         -0.58
    "can"           -0.57
    "hester"        -0.57
    Positive Logits
    Token           Logit
    "ependent"      0.74
    "irection"      0.71
    "oaded"         0.67
    " starve"       0.67
    " something"    0.66
    "ivated"        0.66
    "redients"      0.66
    "utical"        0.64
    " hypocr"       0.64
    "ivable"        0.64
    Activation Density: 0.257%

    No Known Activations