INDEX
    Explanations

    references to bears and bear-related terms

    Negative Logits
    Token         Logit
    "lrrrr"       -0.77
    " Pelop"      -0.69
    " Rau"        -0.67
    " Schur"      -0.64
    "UTERS"       -0.64
    " Carthag"    -0.63
    " Vlas"       -0.63
    "ghed"        -0.62
    "Rau"         -0.62
    " Lapp"       -0.61
    Positive Logits
    Token         Logit
    " Bears"       1.10
    " Bear"        1.04
    " BEAR"        1.01
    " bears"       0.98
    "Bear"         0.98
    " bear"        0.94
    " bearing"     0.91
    "Bears"        0.90
    " Bearing"     0.89
    "BEAR"         0.87
    Activation Density: 0.419%

    No Known Activations