    Explanations

    expressions of positivity and appreciation

    Negative Logits
    Token                  Logit
    "IsContent"            -0.72
    " credere"             -0.69
    " Sane"                -0.66
    " trả"                 -0.65
    "Xk"                   -0.61
    " NUE"                 -0.60
    "PickerController"     -0.59
    " Cuevas"              -0.58
    "Def"                  -0.58
    "le"                   -0.58
    Positive Logits
    Token                  Logit
    " lovely"              1.52
    "lovely"               1.48
    "Lovely"               1.43
    "Wonderful"            1.39
    " Lovely"              1.39
    " Wonderful"           1.32
    " wonderful"           1.28
    "wonderful"            1.25
    "✨:"                   0.99
    " maravilloso"         0.95
    Activation Density: 0.068%

    No Known Activations