INDEX
    NEGATIVE LOGITS
    "𝓬"                1.00
    " accion"          0.95
    " yattha"          0.95
    " are"             0.88
    (no token text)    0.86
    (no token text)    0.86
    " honti"           0.85
    " appelée"         0.84
    " chiamato"        0.83
    " Sereth"          0.83
    POSITIVE LOGITS
    "ä"                1.27
    "em"               1.09
    "ut"               1.00
    " Chicken"         0.99
    "Chicken"          0.95
    " chicken"         0.95
    "🐔"               0.94
    "endes"            0.89
    "il"               0.89
    (no token text)    0.87
    Activation Density 0.020%

    No Known Activations