    NEGATIVE LOGITS
    -write          -0.08
     investigate    -0.08
    thi             -0.07
    gable           -0.07
    heets           -0.07
                    -0.07
     EMC            -0.07
    ټو              -0.07
    كتب             -0.07
    dagen           -0.07
    POSITIVE LOGITS
     эффект         0.10
     Legion         0.09
     ontbreken      0.08
     легко          0.08
     easiest        0.08
     окруж          0.08
    anky            0.07
     facilement     0.07
     Sold           0.07
     आसानी          0.07
    Activation Density 0.001%

    No Known Activations