INDEX
    Explanations
    Negative Logits
     dolphins               0.57
     Dolphin                0.56
     Dolphins               0.55
     whale                  0.54
     whales                 0.54
     Whale                  0.53
    [token not rendered]    0.53
    [token not rendered]    0.50
     dolphin                0.47
    🐬                      0.45
    Positive Logits
     chicken                1.99
     chickens               1.98
     poultry                1.84
    🐔                      1.82
     Chicken                1.79
    Chicken                 1.77
    chicken                 1.74
    [token not rendered]    1.73
    [token not rendered]    1.70
    [token not rendered]    1.67
    Act Density: 0.034%

    No Known Activations