INDEX
    Explanations

    criticism or opposition

    Negative Logits
     candid
    -0.07
    livé
    -0.06
     Бол
    -0.06
    ancellable
    -0.06
    CDF
    -0.06
    нич
    -0.06
     duyệt
    -0.06
    -0.06
     환산
    -0.06
    -0.06
    Positive Logits
    Jane
    0.06
     polite
    0.06
     Managers
    0.06
    ociety
    0.06
    جن
    0.06
    ім
    0.06
     الشيخ
    0.06
    "use
    0.06
    phy
    0.06
     wrought
    0.06
    Activation Density 0.153%

    No Known Activations