INDEX
    Explanations

    phrases related to giving and community service

    Negative Logits
    RESS       -0.15
    uyo        -0.15
    oss        -0.15
    omial      -0.15
     öden      -0.14
    ainter     -0.14
    ynet       -0.14
    ste        -0.14
    hu         -0.14
     acc       -0.14
    Positive Logits
     back       0.28
    back        0.25
    -back       0.22
     BACK       0.22
    äºĪ         0.22
    backs       0.20
     zurück     0.19
     terug      0.18
    _back       0.18
     Back       0.18
    Activation Density: 0.027%

    No Known Activations