    Explanations

    sentiments related to altruism and social support

    Negative Logits
    Token                   Logit
    "luv"                   -0.15
    "ulet"                  -0.15
    "ãĤ¤ãĥ¤"                -0.15
    "amba"                  -0.15
    "pushViewController"    -0.14
    "utron"                 -0.14
    "ÏĦÏī"                  -0.14
    "ắn"                    -0.13
    "еÑģÑĤи"                -0.13
    "æ·"                    -0.13
    Positive Logits
    Token                   Logit
    " step"                 0.53
    " stepped"              0.51
    "step"                  0.47
    " stepping"             0.45
    " steps"                0.44
    "Step"                  0.43
    " Step"                 0.42
    "-step"                 0.42
    " STEP"                 0.39
    ".step"                 0.39
    Activation Density: 0.396%

    No Known Activations