    Explanations

    questions and answers

    Negative Logits
     شهر    -0.07
            -0.06
    ificação    -0.06
     alan    -0.06
    Com    -0.06
    Jos    -0.06
     partisan    -0.06
     fren    -0.06
    สาว    -0.06
    _dis    -0.06
    Positive Logits
    %);↵    0.07
    >(    0.07
     way    0.06
            0.06
     gesture    0.06
     الشيخ    0.06
     KEY    0.06
    θεια    0.06
     Sms    0.06
    이는    0.06
    Activation Density: 0.022%

    No Known Activations