    Explanations

    expressions related to preferences and opinions

    Negative Logits
    "клопе"          -0.58
    "xodo"           -0.54
    " Righteous"     -0.51
    "transQ"         -0.50
                     -0.49
                     -0.47
    "AFA"            -0.47
    "audiovisuel"    -0.46
    " uuidv"         -0.46
    " mourut"        -0.46
    Positive Logits
    " liked"         2.00
    " loved"         1.78
    " liking"        1.76
    " love"          1.72
    " loves"         1.64
    " likes"         1.56
    " enjoyed"       1.54
    "liked"          1.51
    " enjoy"         1.47
    " LOVED"         1.46
    Activation Density: 0.309%

    No Known Activations