INDEX
    Explanations
    NEGATIVE LOGITS
     هفت    -0.06
     histo    -0.06
    'h    -0.06
    роничес    -0.06
    اشت    -0.06
    (flow    -0.06
     prosince    -0.06
     очеред    -0.06
     مشارکت    -0.06
    orro    -0.06
    POSITIVE LOGITS
     medical    0.09
    medical    0.08
     Medical    0.07
     MED    0.07
     Alabama    0.07
     education    0.07
     медицин    0.07
    英语    0.07
     safe    0.07
    ammer    0.07
    Activation Density: 0.033%

    No Known Activations