    Negative Logits
    Token           Logit
    " Ginger"       -0.07
    (whitespace)    -0.07
    "(is"           -0.06
    "06"            -0.06
    "ugh"           -0.06
    "ивает"         -0.06
    " citt"         -0.06
    " cy"           -0.06
    " Copper"       -0.06
    " fossils"      -0.06
    Positive Logits
    Token           Logit
    " pred"         0.15
    "Pred"          0.11
    " Pred"         0.11
    "pred"          0.10
    ".pred"         0.09
    "_pred"         0.08
    "(pred"         0.07
    " contrad"      0.07
    "दम"            0.07
    " jeden"        0.07
    Activation Density: 0.003%

    No Known Activations