INDEX
    Explanations

    phrases related to causation or explanation using the word "because."

    repetitive mentions of the word "the."

    Negative Logits
    "vernment"       -0.76
    "dale"           -0.67
    "advertising"    -0.67
    "DB"             -0.66
    "ira"            -0.64
    "mares"          -0.63
    "ancer"          -0.62
    "ojure"          -0.60
    "shaw"           -0.59
    "edia"           -0.58
    Positive Logits
    " result"        1.35
    " same"          1.31
    "same"           1.21
    " culmination"   1.13
    "ologically"     1.12
    " ones"          1.10
    " envy"          1.09
    " equivalent"    1.08
    " hardest"       1.06
    " easiest"       1.01
    Activation Density: 0.137%

    No Known Activations