INDEX
    Explanations

    words related to swindling or dishonesty

    Negative Logits
    Token          Logit
    "pora"         -0.69
    " negligent"   -0.67
    "cised"        -0.66
    "onomy"        -0.65
    "_-"           -0.64
    " resting"     -0.62
    " degrade"     -0.61
    " flawed"      -0.60
    " Engel"       -0.59
    " Kubrick"     -0.59
    Positive Logits
    Token          Logit
    "indle"        1.13
    "imming"       1.12
    "immers"       1.10
    "anky"         1.08
    "itched"       1.06
    "addle"        1.05
    "arf"          1.03
    "itcher"       1.02
    "inging"       1.02
    "itching"      1.01
    Activation Density: 0.012%

    No Known Activations