INDEX
    Explanations
    Negative Logits
    " betre"       -0.10
    " Mig"         -0.08
    " Maui"        -0.08
    "pand"         -0.07
    " harsh"       -0.07
    "young"        -0.07
    "inar"         -0.07
    (blank)        -0.07
    " Pandemie"    -0.07
    " Jig"         -0.07
    Positive Logits
    (blank)        0.08
    "om"           0.08
    " Campbell"    0.07
    " ensemble"    0.07
    " oeuvre"      0.07
    "س"            0.07
    " medal"       0.07
    " diapers"     0.07
    " ensembles"   0.07
    " templo"      0.07
    Activation Density: 0.002%

    No Known Activations