INDEX
    Explanations

    phrases related to empathy and support for others

    instances of the word "and," indicating a focus on conjunctions or connections between ideas

    Negative Logits
    Token             Logit
    "Sat"             -0.76
    "agen"            -0.71
    "itarian"         -0.68
    "lav"             -0.67
    "ibi"             -0.67
    "Sov"             -0.67
    "ignant"          -0.66
    "atan"            -0.66
    "ASC"             -0.65
    "hov"             -0.64
    Positive Logits
    Token             Logit
    " hopefully"      1.05
    "romeda"          0.96
    " thereby"        0.95
    " thus"           0.95
    " secondly"       0.89
    " lifestyles"     0.87
    " consequently"   0.87
    " enjoy"          0.86
    " interacts"      0.85
    " hence"          0.84
    Act Density 0.422%

    No Known Activations