INDEX
    Explanations

    sentiments of love and appreciation towards a product or experience

    love and hate expressions

    Negative Logits
    Token              Logit
    " שוליים"          -0.71
    "AutoScaleMode"    -0.65
    " ſche"            -0.64
    " ligiloj"         -0.62
    "NameInMap"        -0.60
    " ſta"             -0.60
    " 好文分享"         -0.59
    " パンチラ"         -0.58
    "بوابة"            -0.57
    "<unused47>"       -0.57
    Positive Logits
    Token              Logit
    " enamor"          0.44
    "hate"             0.40
    "love"             0.40
    " love"            0.39
    " LOVE"            0.38
    " Love"            0.38
    "Love"             0.36
    " hate"            0.35
    " loves"           0.35
    " loved"           0.34
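
The page does not say how the token lists above are produced. A common convention for feature dashboards, assumed here and not confirmed by the source, is to project the feature's output direction onto each token's unembedding vector and then list the most negative and most positive results. The sketch below illustrates that idea with hypothetical names (`logit_effects`, `top_and_bottom`) and made-up toy vectors; none of the numbers come from the actual model.

```python
from typing import Dict, List, Tuple


def logit_effects(direction: List[float],
                  unembed: Dict[str, List[float]]) -> Dict[str, float]:
    """Dot product of a feature direction with each token's unembedding vector.

    Assumption: this is how a dashboard might score tokens; the source
    does not state the actual method.
    """
    return {
        token: sum(d * w for d, w in zip(direction, vector))
        for token, vector in unembed.items()
    }


def top_and_bottom(effects: Dict[str, float],
                   k: int) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return the k most negative and the k most positive (token, score) pairs."""
    ranked = sorted(effects.items(), key=lambda item: item[1])
    negative = ranked[:k]        # lowest scores first
    positive = ranked[::-1][:k]  # highest scores first
    return negative, positive


# Toy data, invented for illustration only.
direction = [1.0, 0.0]
unembed = {
    " love": [0.39, 0.10],
    "hate": [0.40, -0.20],
    "AutoScaleMode": [-0.65, 0.30],
    " the": [0.00, 0.50],
}

effects = logit_effects(direction, unembed)
negative, positive = top_and_bottom(effects, k=1)
print("most suppressed:", negative)
print("most promoted:", positive)
```

Tokens are kept as exact strings, so `"hate"` and `" hate"` (with a leading space) are ranked as separate vocabulary entries, which matches how they appear as separate rows in the lists.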
    Act Density 0.027%
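
The density figure is not defined on the page. A minimal sketch, assuming it means the fraction of sampled tokens on which the feature activates above zero, is given below. The function name `activation_density` and the sample counts are hypothetical and chosen only so that the arithmetic gives a figure of the same magnitude.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds the threshold.

    Assumption: "density" is read here as the share of tokens on which
    the feature fires; the source does not define the term.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Invented sample: 27 firing tokens out of 100,000.
sample = [0.0] * 99_973 + [1.5] * 27
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.027% means the feature is active on roughly 27 tokens in every 100,000.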

    No Known Activations