    Explanations

    phrases indicating reasons or explanations

    the word "why" used in various contexts throughout the text

    Negative Logits
    Token        Logit
    "ーク"       -0.78
    "ty"         -0.67
    "rop"        -0.67
    "aith"       -0.67
    "ット"       -0.63
    "bow"        -0.63
    "bor"        -0.62
    "zman"       -0.60
    "result"     -0.57
    "iox"        -0.57
    Positive Logits
    Token            Logit
    " why"           0.91
    " we"            0.89
    " they"          0.81
    "soever"         0.76
    " many"          0.73
    " it"            0.72
    " Canaver"       0.72
    "terday"         0.66
    "ratulations"    0.66
    " there"         0.66
    Activation Density: 0.048%

    No Known Activations