    Explanations

    references to sharks in various contexts, including attacks, movies, and metaphors

    Negative Logits
    Token            Logit
    "ories"          -0.80
    "ISTER"          -0.76
    "haar"           -0.71
    "ISION"          -0.70
    " Winchester"    -0.69
    "mble"           -0.67
    "dit"            -0.67
    "onsense"        -0.66
    "ijah"           -0.65
    " Hemp"          -0.63

    Positive Logits
    Token            Logit
    " fins"          0.99
    " Sharks"        0.92
    " sharks"        0.88
    "ulic"           0.86
    "mong"           0.85
    "fish"           0.83
    " shark"         0.82
    "vati"           0.82
    " Shark"         0.81
    "affe"           0.79

    Activation Density: 0.013%

    No Known Activations