INDEX
    Explanations

    phrases relating to being unharmed or safe

    mentions of the word "har" or related variations, possibly indicating a focus on the term's usage in various contexts

    Negative Logits
    Token         Logit
    "htaking"     -0.91
    "lect"        -0.81
    "essee"       -0.78
    "uring"       -0.70
    "hift"        -0.70
    "BOOK"        -0.70
    "urally"      -0.69
    "olve"        -0.67
    " fuzz"       -0.64
    " chalk"      -0.64
    Positive Logits
    Token         Logit
    "assment"     1.01
    "tha"         0.99
    "allel"       0.93
    " Tsarnaev"   0.91
    "vard"        0.90
    "riors"       0.89
    "riage"       0.85
    "riages"      0.84
    "ashtra"      0.82
    "adow"        0.80
    Act Density 0.066%

    No Known Activations