INDEX
    Explanations

    mentions of the word "penguins"

    Negative Logits
    Token         Logit
    "WORK"        -0.82
    "ysis"        -0.80
    "puter"       -0.73
    "phas"        -0.71
    "usted"       -0.66
    "¿½"          -0.66
    " ILCS"       -0.65
    "neau"        -0.65
    "parts"       -0.65
    "lder"        -0.64

    Positive Logits
    Token         Logit
    "insula"       0.99
    " Penguins"    0.86
    " pengu"       0.82
    "aukee"        0.77
    "atoon"        0.77
    "engu"         0.74
    "keye"         0.72
    " Pengu"       0.72
    "unia"         0.71
    " Hots"        0.67

    Activation Density: 0.025%

    No Known Activations