INDEX
    Explanations

    expressions highlighting the concept of love and relationships

    Negative Logits
    Token              Logit
    "Datuak"           -1.13
    " decembrie"       -0.76
    "ؤلاء"             -0.76
    " Пусть"           -0.68
    " noiembrie"       -0.66
    " Wheeler"         -0.65
    "tioners"          -0.64
    " ویکی‌پدیای"      -0.63
    " daz"             -0.63
    " };\n"            -0.62
    Positive Logits
    Token              Logit
    " love"            1.80
    " LOVE"            1.69
    "LOVE"             1.61
    " Love"            1.58
    "Love"             1.50
    " loves"           1.49
    "love"             1.48
    "loves"            1.41
    " Loves"           1.41
    " loving"          1.38
    Activation Density: 0.043%

    No Known Activations