INDEX
    Explanations

    actions related to providing help or support to others

    Negative Logits
    Token            Logit
    "bang"           -0.66
    "mini"           -0.63
    "ndra"           -0.62
    "ortex"          -0.62
    "ppings"         -0.62
    "ウス"           -0.61
    "attribute"      -0.61
    "ール"           -0.59
    "nova"           -0.59
    "iannopoulos"    -0.58
    Positive Logits
    Token            Logit
    " with"          0.74
    " efforts"       0.71
    " in"            0.68
    " landowners"    0.65
    " financially"   0.65
    "umsy"           0.64
    " technicians"   0.64
    "ieth"           0.63
    " us"            0.62
    " digestion"     0.62
    Activation Density: 0.101%

    No Known Activations