INDEX
    Explanations

    instances of the word "that."

    Negative Logits
    "pery"              -0.18
    "ãĤ¤ãĥ¤"            -0.18
    "BuilderFactory"    -0.17
    " navr"             -0.16
    "deaux"             -0.16
    "iaux"              -0.16
    " Means"            -0.16
    "rud"               -0.15
    "theid"             -0.15
    "eid"               -0.14
    Positive Logits
    " way"              0.56
    "-way"              0.37
    "way"               0.35
    " Way"              0.34
    "_way"              0.32
    ".way"              0.30
    " WAY"              0.29
    "away"              0.29
    "Way"               0.28
    " direction"        0.28
    Activation Density 0.018%

    No Known Activations