    Negative Logits
    Token           Logit
    "flower"        -0.08
    " Pal"          -0.08
    "brush"         -0.08
    " Acceptance"   -0.08
    "Pal"           -0.08
    "री"            -0.08
    " PEN"          -0.08
    (blank)         -0.07
    "awn"           -0.07
    " Merkel"       -0.07
    Positive Logits
    Token           Logit
    "mf"            0.09
    " demais"       0.09
    "ifies"         0.08
    " asunto"       0.08
    " ours"         0.08
    " optar"        0.08
    "Incoming"      0.07
    " than"         0.07
    " issu"         0.07
    "(or"           0.07
    Activation Density: 0.022%

    No Known Activations