INDEX
    Explanations

    words related to positive attributes or actions, specifically focusing on "good"

    phrases emphasizing the concept of "good."

    Negative Logits
    ĸļ            -0.76
    olate         -0.76
    eteria        -0.73
    lets          -0.69
    kson          -0.68
     Sturgeon     -0.67
    eters         -0.67
    apse          -0.66
    otom          -0.66
    ulous         -0.65
    Positive Logits
     intentions   1.26
     deeds        1.19
     deed         1.14
     Samar        1.10
     ol           1.05
     luck         1.00
    reads         1.00
     manners      0.95
     fortune      0.93
    die           0.93
    Activation Density: 0.084%

    No Known Activations