    Explanations
    Negative Logits
    Token            Logit
    " clinicians"    -0.08
    (blank)          -0.07
    " Black"         -0.07
    " clinician"     -0.07
    "mynd"           -0.07
    " mourn"         -0.07
    " fever"         -0.07
    "ptune"          -0.07
    "ocaine"         -0.07
    " भूल"           -0.07
    Positive Logits
    Token            Logit
    " nok"           0.09
    " noko"          0.09
    " Nes"           0.08
    " recipro"       0.08
    " alak"          0.08
    " trik"          0.08
    " выраж"         0.08
    " expans"        0.08
    " sug"           0.07
    "upal"           0.07
    Activation Density: 0.003%

    No Known Activations