INDEX
    Explanations

    phrases related to caring for and supporting others

    Negative Logits
    ッド         -0.85
    ンジ         -0.78
    kered       -0.71
     bluff      -0.70
    ▬           -0.69
     rand       -0.67
     IGN        -0.64
    ▬▬          -0.62
    akedown     -0.62
    lihood      -0.61
    Positive Logits
    taker       1.70
    giving      1.32
    taking      1.21
    lessness    1.03
    fully       1.02
    tta         1.01
    ening       0.99
    free        0.97
    lessly      0.96
    ful         0.88
    Activation Density: 0.026%

    No Known Activations