    NEGATIVE LOGITS
    Token             Logit
    "ertia"           -0.95
    "开学"            -0.92
    "vaders"          -0.89
    " Pils"           -0.83
    " LESSON"         -0.82
    " MONTH"          -0.81
    " WEEKLY"         -0.81
    " pompe"          -0.81
    "bedingungen"     -0.80
    "спользова"       -0.79
    POSITIVE LOGITS
    Token             Logit
    " interview"      2.66
    " interviewer"    2.16
    " interviewers"   2.11
    " Interview"      2.06
    "Interview"       1.91
    " interviewing"   1.84
    "interview"       1.80
    " interviews"     1.64
    "面试"            1.62
    "Interviewer"     1.60
    Activation Density: 0.021%

    No Known Activations