INDEX
    Explanations

    people's names and specific job titles

    Negative Logits
    " himself"      -0.85
    " his"          -0.71
    "himself"       -0.69
    "FetchType"     -0.69
    " חיצוניים"     -0.67
    " होती"         -0.66
    "his"           -0.66
    "His"           -0.65
    " seinen"       -0.64
    " होगी"         -0.63
    Positive Logits
    " depic"        1.71
    " maneu"        1.71
    " strick"       1.67
    " fta"          1.66
    " shenan"       1.64
    " thut"         1.60
    " inev"         1.59
    " ftu"          1.58
    " aen"          1.57
    " accla"        1.57
    Act Density 0.455%

    No Known Activations