INDEX
    Explanations

    the word "cat" within text

    occurrences of the word "cat", including variants such as capitalized forms or compounds with other words

    occurrences of the word "cat"

    Negative Logits
    " Vander"      -0.73
    " DPR"         -0.67
    " Gree"        -0.67
    "mble"         -0.67
    " Fargo"       -0.65
    " steroids"    -0.64
    " Vaugh"       -0.63
    " Matter"      -0.62
    " Cere"        -0.61
    "uden"         -0.60
    Positive Logits
    "cat"          1.35
    "aclysm"       1.27
    "alogue"       1.25
    "alog"         1.24
    "alyst"        1.22
    "apult"        1.17
    "hedral"       1.10
    "Cat"          1.06
    "cats"         1.05
    "heter"        1.00
    Activation Density: 0.008%

    No Known Activations