INDEX
    Explanations

    words related to deception or trickery

    terminology related to deception and being misled

    Negative Logits
    Token             Logit
    "area"            -0.68
    "oran"            -0.65
    "Occup"           -0.64
    "mun"             -0.61
    "Interstitial"    -0.60
    "foreseen"        -0.59
    "capacity"        -0.59
    "empl"            -0.57
    "Pain"            -0.55
    " grievances"     -0.55
    Positive Logits
    Token             Logit
    " deceive"        1.04
    " deceived"       1.00
    " fooled"         0.99
    "ingly"           0.96
    " gull"           0.91
    "eering"          0.85
    "ulent"           0.85
    " tricked"        0.84
    " confuse"        0.81
    " unwitting"      0.81
    Activation Density: 0.033%

    No Known Activations