    Negative Logits
    Token          Logit
    " Owens"       -0.09
    " bino"        -0.09
    " heraus"      -0.08
    " yay"         -0.08
    "hape"         -0.07
    " arbeitet"    -0.07
                   -0.07
    "conversion"   -0.07
    " Meer"        -0.07
                   -0.07
    Positive Logits
    Token            Logit
    " importantly"   0.08
    "REP"            0.07
    "za"             0.07
    " intelligent"   0.07
    " Buffet"        0.07
    " Fris"          0.07
    "Cb"             0.07
    " idem"          0.07
    " Fer"           0.07
    " William"       0.07
    Activation Density: 0.166%

    No Known Activations