INDEX
    Explanations

    the word "loving" or words associated with love and affection

    references to the word "love."

    Negative Logits
    Token           Logit
    " Cardinal"     -0.70
    " bailout"      -0.69
    " testament"    -0.66
    " standard"     -0.65
    "opsy"          -0.65
    "GC"            -0.63
    " Racial"       -0.63
    " Billboard"    -0.63
    " Penny"        -0.63
    " Crystal"      -0.62
    Positive Logits
    Token           Logit
    "lov"           5.20
    "liv"           1.65
    "lav"           1.39
    "hov"           1.18
    "lain"          1.17
    "akov"          1.17
    "lik"           1.13
    "д"             1.11
    "lr"            0.98
    "ov"            0.98
    Act Density 0.027%

    No Known Activations