INDEX
    Explanations

    words related to surrendering or giving in

    past tense verbs and actions related to causing harm or impact

    Negative Logits
    vas       -0.66
    mania     -0.65
    gram      -0.65
    find      -0.65
    spe       -0.64
    fam       -0.63
    grad      -0.62
    von       -0.61
    aceae     -0.61
     Kills    -0.61
    Positive Logits
    oots      0.78
     toe      0.70
     ears     0.69
    kered     0.68
     Wrath    0.67
    cffffcc   0.66
    itors     0.65
     goodbye  0.64
     tails    0.64
    pring     0.64
    Act Density 0.058%

    No Known Activations