    Explanations

    phrases indicating collective benefit and altruism

    Negative Logits
    Token            Logit
    "ordes"          -0.15
    "deniz"          -0.15
    "insky"          -0.15
    "CTS"            -0.14
    "/mainwindow"    -0.14
    "åł¡"            -0.14
    "_busy"          -0.14
    "浪"             -0.13
    "é¡į"            -0.13
    "639"            -0.13
    Positive Logits
    Token            Logit
    " benefit"       0.37
    " good"          0.36
    " common"        0.35
    " greater"       0.32
    "good"           0.30
    " Good"          0.30
    " Benefit"       0.30
    "common"         0.28
    "-good"          0.28
    "/common"        0.28
    Activation Density: 0.098%

    No Known Activations