INDEX
    Explanations

    adjectives related to weakening or reducing something

    words related to reducing or hindering something

    Negative Logits
     boldly   -0.68
     prow     -0.63
     americ   -0.60
     beware   -0.59
     wisely   -0.59
     eyed     -0.58
     wont     -0.57
     snail    -0.57
     Scand    -0.57
     Brit     -0.57
    Positive Logits
    uate       0.94
    uates      0.87
    utive      0.82
    Increase   0.81
    activate   0.79
    chieve     0.78
    ior        0.75
    uating     0.75
    ependent   0.74
    uce        0.73
    Activation Density: 0.091%

    No Known Activations