INDEX
    Explanations

    words related to specific animals like penguins and tigers

    references to specific animals or characters

    Negative Logits
    Token        Logit
    "erest"      -0.86
    "lessly"     -0.83
    "neau"       -0.76
    "arnaev"     -0.75
    "iggins"     -0.74
    "bender"     -0.71
    "urst"       -0.69
    "mble"       -0.69
    "ites"       -0.67
    "staking"    -0.65
    Positive Logits
    Token        Logit
    " Doodle"     0.94
    " pengu"      0.89
    "eday"        0.79
    "cean"        0.76
    " pige"       0.73
    " Pengu"      0.71
    " Penguin"    0.69
    " retri"      0.68
    " Britann"    0.67
    " Alley"      0.67
    Activation Density: 0.023%

    No Known Activations