INDEX
    Explanations

    the word "There" at the beginning of sentences

    Negative Logits
    Token         Logit
    "icial"       -0.59
    "ointed"      -0.57
    " Armored"    -0.55
    "elta"        -0.55
    " Submit"     -0.54
    " Applied"    -0.54
    "EA"          -0.54
    " Khe"        -0.54
    " Tamil"      -0.53
    "submit"      -0.52
    Positive Logits
    Token         Logit
    "abouts"      1.50
    "fore"        1.07
    " ain"        1.05
    " weren"      1.04
    " aren"       1.04
    "upon"        1.00
    " wasn"       0.98
    " isn"        0.96
    "after"       0.96
    "'ll"         0.95
    Activation Density: 0.117%

    No Known Activations