INDEX
    Negative Logits
                          -0.07
    " злоч"               -0.07
    " TAM"                -0.07
    ".getElementsByName"  -0.06
    " Nate"               -0.06
    "енные"               -0.06
    "ang"                 -0.06
    "Avg"                 -0.06
    " inve"               -0.06
    "::{↵"                -0.06
    Positive Logits
    " Doctor"     0.16
    " doctor"     0.12
    "Doctor"      0.11
    " Doctors"    0.11
    " doctors"    0.11
    " doctoral"   0.08
    " Dor"        0.08
    " dont"       0.08
    "doctor"      0.08
    " boot"       0.08
    Act Density 0.007%

    No Known Activations