INDEX
    Explanations

    words related to prediction or warning

    instances of the word "fore" in various contexts

    Negative Logits
    BuyableInstoreAndOnline  -0.84
    REDACTED                 -0.79
    RED                      -0.77
    IRO                      -0.73
    å°Ĩ                      -0.73
     Stain                   -0.70
    OPLE                     -0.67
     Ou                      -0.66
     Franks                  -0.65
     Shed                    -0.63

    Positive Logits
    nsics                    1.09
    shadow                   1.07
    warn                     1.02
    told                     1.01
    father                   1.01
    warning                  0.99
     fore                    0.97
    runner                   0.97
    sight                    0.96
    shore                    0.95
    Activation Density: 0.007%

    No Known Activations