INDEX
    NEGATIVE LOGITS
    Token          Logit
    " Projects"    -0.08
    "ЎыџN"         -0.07
    " witches"     -0.07
    "роничес"      -0.07
    " Books"       -0.07
    " Abrams"      -0.07
    " Newman"      -0.06
    " Nature"      -0.06
    ".org"         -0.06
    " ejected"     -0.06
    POSITIVE LOGITS
    Token          Logit
    " salary"      0.14
    " Salary"      0.12
    "Salary"       0.10
    " salaries"    0.10
    "salary"       0.10
    "alary"        0.07
    "alley"        0.07
    " tariff"      0.07
    "arr"          0.07
    "ilib"         0.07
    Activation Density: 0.005%

    No Known Activations