    Explanations

    expressions related to emotions or attitudes of sincerity or dedication

    terms related to kindness or warmth versus coldness and cruel behavior

    Negative Logits
    Downloadha  -0.83
    ggies       -0.71
    assic       -0.66
    ICH         -0.64
    andra       -0.64
    ICO         -0.64
    abases      -0.63
    MAT         -0.61
    JO          -0.60
    Indust      -0.60
    Positive Logits
    hearted     1.32
    ness        0.86
    heartedly   0.82
    tons        0.74
     endeavour  0.72
    terness     0.70
    nesses      0.69
     glances    0.68
    acters      0.68
     altru      0.68
    Activation Density: 0.006%

    No Known Activations