INDEX
    Explanations

    words with negative connotations or describing negative characteristics/actions

    instances of the word "bad" in various contexts

    Negative Logits
    Token            Logit
    "ĸļ"             -0.91
    "raltar"         -0.84
    "ensional"       -0.79
    "ittees"         -0.78
    "conservancy"    -0.77
    "aukee"          -0.77
    "eters"          -0.76
    "olate"          -0.76
    "ynthesis"       -0.74
    "illation"       -0.74
    Positive Logits
    Token            Logit
    "dest"           1.08
    "dies"           1.04
    "die"            0.98
    "gered"          0.92
    " karma"         0.91
    "ger"            0.89
    " luck"          0.86
    " manners"       0.80
    " publicity"     0.78
    "ged"            0.78
    Act Density 0.025%

    No Known Activations