INDEX
    Explanations

    either "like the" or abbreviations/short informal words

    media/entertainment

    Negative Logits
    ########.
    -0.93
     بيها
    -0.79
     مشين
    -0.75
    AutoScaleMode
    -0.69
    enumii
    -0.67
    رشف
    -0.66
    enumi
    -0.65
     gangen
    -0.64
    NameInMap
    -0.63
    makeConstraints
    -0.63
    Positive Logits
     like
    2.17
    like
    1.94
    Like
    1.90
     Like
    1.89
     LIKE
    1.77
    LIKE
    1.70
     likes
    1.28
     seperti
    1.23
     Seperti
    1.16
     như
    1.15
    Act Density 0.500%

    No Known Activations