    Explanations

    the word "have"

    Negative Logits
        Token         Logit
        " Apart"      -0.73
        "catentry"    -0.64
        "oshi"        -0.59
        "icking"      -0.58
        " Crown"      -0.55
        "eem"         -0.55
        "arter"       -0.54
        "TG"          -0.54
        " recalls"    -0.54
        " dodge"      -0.53
    Positive Logits
        Token           Logit
        " been"         1.37
        "been"          1.19
        " Been"         1.02
        " gotten"       1.00
        " seen"         0.93
        " done"         0.90
        " undergone"    0.89
        " begun"        0.88
        " become"       0.87
        " gone"         0.86
    Act Density 0.321%
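
    As a rough illustration of how an activation density figure like this can be read, the sketch below assumes density means the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, and the toy activation values are assumptions made for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition for illustration only; the tool that produced this page
    may compute density differently.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 3 active positions out of 1000 sampled tokens.
toy = [0.0] * 997 + [0.8, 1.2, 0.4]
density = activation_density(toy)
print(f"{density:.3%}")
```

    Under that assumption, a density of 0.321% would correspond to roughly 3 active tokens per 1,000 sampled.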

    No Known Activations