    Explanations

    phrases related to various reasons and explanations

    references to significant reasons or explanations within the text

    Negative Logits
    "IELD"       -0.66
    "taker"      -0.62
    "owers"      -0.54
    "riage"      -0.54
    "ERROR"      -0.54
    "ocl"        -0.53
    "exit"       -0.53
    "OWER"       -0.53
    "å§"         -0.53
    " keeper"    -0.53
    Positive Logits
    " ranging"       1.61
    " include"       1.41
    " varied"        1.40
    " ranged"        1.33
    " includ"        1.31
    " including"     1.24
    "including"      1.23
    " summarized"    1.21
    " varying"       1.19
    "ranging"        1.15
    Act Density 0.882%

    No Known Activations