INDEX
    Explanations

    statements or questions asking for reasons or explanations

    instances of the word "why" indicating inquiries or explanations

    Negative Logits
    Token           Logit
    " Roller"       -0.74
    "ymph"          -0.71
    "ages"          -0.70
    "trop"          -0.68
    "rop"           -0.66
    "robe"          -0.64
    "amps"          -0.63
    " puck"         -0.63
    "field"         -0.62
    " Sailor"       -0.62
    Positive Logits
    Token           Logit
    "soever"        1.16
    " why"          1.14
    "why"           1.08
    " WHY"          1.01
    "iterranean"    0.84
    "ihad"          0.83
    "Why"           0.82
    " exactly"      0.77
    "tical"         0.75
    "utterstock"    0.74
    Activation Density: 0.036%

    No Known Activations