INDEX
    Explanations

    the word "are" repeated multiple times

    the word "are" in various contexts

    Negative Logits
    OOL        -0.61
    Nev        -0.60
    uration    -0.60
    erguson    -0.59
     shape     -0.58
    omez       -0.58
    allery     -0.58
    inosaur    -0.58
    ulates     -0.58
    ues        -0.57
    Positive Logits
    nce        1.05
    nces       1.00
    tsky       0.90
    nes        0.85
    than       0.84
    nt         0.83
    atra       0.82
    edia       0.81
    lli        0.81
    tto        0.81
    Activation Density 0.014%

    No Known Activations