INDEX
    Explanations

    positive attributes or actions related to morality or ethics

    phrases or concepts associated with "good faith" or general goodness

    Negative Logits
    Token           Logit
    "ĸļ"            -0.80
    "eters"         -0.72
    "olate"         -0.71
    " Sturgeon"     -0.70
    "otom"          -0.70
    "_>"            -0.68
    "EStream"       -0.67
    "agos"          -0.67
    " Canaver"      -0.66
    "hyde"          -0.66
    Positive Logits
    Token           Logit
    " intentions"   1.27
    " Samar"        1.23
    " luck"         1.19
    "bye"           1.18
    "reads"         1.14
    " deeds"        1.13
    " deed"         1.12
    " ol"           1.10
    " fortune"      1.05
    "enough"        1.04
    Activation Density: 0.068%

    No Known Activations