    Explanations

    instances of the word "that."

    Negative Logits
    "Guard"      -0.74
    "lean"       -0.69
    "aq"         -0.67
    "oses"       -0.66
    "gur"        -0.63
    "ax"         -0.59
    "´"          -0.59
    "aukee"      -0.59
    "le"         -0.59
    "respect"    -0.59
    Positive Logits
    " they"        0.99
    " THEY"        0.92
    " there"       0.92
    "soever"       0.91
    " we"          0.88
    " nobody"      0.86
    " unlike"      0.81
    "*/("          0.80
    " it"          0.78
    " although"    0.78
    Activation Density: 0.159%

    No Known Activations