INDEX
    Explanations

    words relating to strong positive emotions, particularly a high degree of liking or affection

    mentions of fondness or positive feelings

    Negative Logits
    Token      Logit
    "irrel"    -0.78
    "adesh"    -0.74
    "UGH"      -0.69
    "Tube"     -0.67
    "udder"    -0.67
    "opers"    -0.65
    "ħĭ"       -0.64
    "atts"     -0.64
    "KT"       -0.64
    "IDER"     -0.64
    Positive Logits
    Token          Logit
    " fond"        1.20
    "ness"         0.94
    " memories"    0.88
    "uously"       0.87
    " farewell"    0.83
    " remem"       0.82
    "nesses"       0.81
    "rait"         0.79
    "entimes"      0.79
    "ries"         0.76
    Activation Density: 0.010%

    No Known Activations