INDEX
    Explanations

    mentions of bears

    references to bears in various contexts

    Negative Logits
    Token        Logit
    "anwhile"    -0.85
    "lectic"     -0.81
    "ij士"       -0.74
    "ADRA"       -0.73
    "icut"       -0.72
    "inx"        -0.72
    "enta"       -0.71
    "arta"       -0.71
    "ocrates"    -0.71
    "yrim"       -0.70
    Positive Logits
    Token        Logit
    "bear"       1.00
    " cub"       0.94
    " claws"     0.91
    " paws"      0.90
    " hugs"      0.89
    " Bears"     0.87
    " paw"       0.86
    " Grizz"     0.85
    " hug"       0.83
    "beit"       0.78
    Activation Density: 0.011%

    No Known Activations