INDEX
    Explanations

    phrases related to community improvement and altruism

    Negative Logits
    Token             Logit
    "ordes"           -0.16
    "CTS"             -0.14
    "deniz"           -0.14
    "itty"            -0.14
    "eniz"            -0.14
    "åł¡"             -0.13
    "utos"            -0.13
    "insky"           -0.13
    "/mainwindow"     -0.13
    "_busy"           -0.13
    Positive Logits
    Token             Logit
    " benefit"        0.40
    " common"         0.39
    " greater"        0.34
    " Benefit"        0.33
    "common"          0.31
    " Common"         0.30
    "/common"         0.30
    " good"           0.28
    "greater"         0.28
    " COMMON"         0.28
    Act Density 0.086%
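
For orientation, the activation density figure can be read as the share of sampled token positions on which the feature fires at all. The sketch below is a minimal illustration under that assumption. The function name `activation_density`, the `threshold` parameter, and the synthetic activations are hypothetical and do not come from this page or from any particular library.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions with a nonzero
    (here: above-threshold) feature activation.
    """
    total = 0
    active = 0
    for value in activations:
        total += 1
        if value > threshold:
            active += 1
    # An empty sample has no active positions by convention.
    return active / total if total else 0.0


# Synthetic data: 100,000 token positions, 86 of which activate the feature.
sample = [0.0] * 99_914 + [1.5] * 86
print(f"{activation_density(sample):.3%}")  # prints 0.086%
```

Under this reading, a density of 0.086% corresponds to roughly 86 active positions per 100,000 sampled tokens, which is typical of a sparse feature.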

    No Known Activations