INDEX
    Explanations

    instances and expressions of love

    Negative Logits
    "Datuak"        -1.12
    " decembrie"    -0.74
    "standers"      -0.71
    "ؤلاء"          -0.67
    " octombrie"    -0.65
    "tioners"       -0.64
    "barra"         -0.63
    " noiembrie"    -0.63
    "طقة"           -0.62
    " linkovi"      -0.62
    Positive Logits
    " love"      1.31
    " LOVE"      1.30
    " Loves"     1.25
    "LOVE"       1.25
    " loves"     1.19
    " Love"      1.16
    "loves"      1.14
    "Loves"      1.13
    " loving"    1.11
    "Love"       1.09
    Activation Density: 0.045%

    No Known Activations