    Explanations

    expressions and variations of the word "love."

    Negative Logits
    Token           Logit
    'Datuak'        -0.99
    ')";'           -0.67
    ' decembrie'    -0.67
    ' noiembrie'    -0.66
    'ؤلاء'          -0.66
    'tioners'       -0.66
    'ciato'         -0.64
    ' Scissors'     -0.64
    'Sinon'         -0.63
    ' daz'          -0.63
    Positive Logits
    Token           Logit
    ' love'         1.62
    ' LOVE'         1.53
    ' loves'        1.50
    ' Loves'        1.45
    ' Love'         1.44
    'LOVE'          1.44
    'loves'         1.37
    'Love'          1.32
    'Loves'         1.32
    ' loving'       1.31
    Activation Density: 0.046%

    No Known Activations