INDEX
    Explanations

    words related to positive impact or benefit

    the concept of "good" in various contexts

    NEGATIVE LOGITS
    Token          Logit
    "eters"        -0.79
    "agos"         -0.74
    "ptin"         -0.73
    "hod"          -0.70
    "âĹ¼"          -0.67
    " Pavilion"    -0.67
    "pper"         -0.67
    "kson"         -0.66
    "ocene"        -0.66
    "gow"          -0.65
    POSITIVE LOGITS
    Token            Logit
    "enough"         1.10
    "reads"          1.05
    " deed"          0.95
    " intentions"    0.94
    " deeds"         0.93
    " Samar"         0.92
    " luck"          0.91
    "sword"          0.89
    "NESS"           0.81
    "luck"           0.81
    Activation Density: 0.056%

    No Known Activations