INDEX
    Explanations

    phrases indicating health risks or medical recommendations

    Negative Logits
    Token              Logit
     pinulongan        -1.26
    mybatisplus        -1.22
     EconPapers        -1.14
    Билгалдахарш       -1.12
    最快更新            -1.11
     للمعارف           -1.09
    Vidite             -1.09
    Filmografie        -1.08
    تقاوى              -1.08
     GenerationType    -1.07
    Positive Logits
    Token              Logit
    [not preserved]    0.71
    ↵↵                 0.70
     you               0.57
    </em>              0.55
    .                  0.54
     You               0.54
    The                0.52
     is                0.52
    You                0.51
     and               0.50
    Activation Density: 2.169%

    No Known Activations