INDEX
    Explanations

    expressions of affection or love

    expressing positive sentiment

    Negative Logits
    Token          Logit
    " Cigar"       -0.51
    " portico"     -0.50
    " Quay"        -0.49
    " Usher"       -0.49
    "CGContext"    -0.49
    " Chanti"      -0.47
    " Somerset"    -0.47
    " PCI"         -0.47
    " BCA"         -0.46
    "tiac"         -0.46
    Positive Logits
    Token          Logit
    " love"        1.71
    " LOVE"        1.59
    "Love"         1.54
    "love"         1.50
    " Love"        1.47
    "LOVE"         1.45
    " loved"       1.30
    " loves"       1.29
    " Loves"       1.24
    "loves"        1.24
    Activation Density: 0.055%

    No Known Activations