INDEX
    Explanations
    Negative Logits
    Token                    Logit
    "getAll"                 -0.07
    " ascent"                -0.07
    "(ax"                    -0.07
    "ewolf"                  -0.06
    " disproportionately"    -0.06
    " positive"              -0.06
    " llegar"                -0.06
    " cyan"                  -0.06
    "=str"                   -0.06
    "\tlayer"                -0.06
    Positive Logits
    Token                    Logit
    " Oracle"                0.07
    " tablename"             0.06
    (not rendered)           0.06
    " practition"            0.06
    "اری"                    0.06
    " tc"                    0.06
    "wizard"                 0.06
    " Scientology"           0.06
    (not rendered)           0.06
    "Oracle"                 0.06
    Act Density 0.003%

    No Known Activations