INDEX
    Negative Logits
     Araştır        -0.07
    [line           -0.07
     Yüksek         -0.07
                    -0.06
     qualité        -0.06
     Shields        -0.06
     Omar           -0.06
     corrected      -0.06
     comparing      -0.06
    ován            -0.06
    Positive Logits
     participant    0.09
     partner        0.08
    参与            0.08
    Attend          0.07
     Participants   0.07
     witnessed      0.07
    BI              0.07
     Harmon         0.07
     injust         0.06
     IMPLEMENT      0.06
    Activation Density: 0.036%

    No Known Activations