INDEX
    Explanations

    questions or statements indicating confusion or requesting clarification

    instances of the word "why" and its contextual usage suggesting reasoning or explanations

    Negative Logits
    lator
    -0.79
     Roller
    -0.77
    ymph
    -0.74
    イト
    -0.74
    rop
    -0.63
    thus
    -0.63
    ス
    -0.61
    opic
    -0.61
    aughed
    -0.61
    shaw
    -0.61
    Positive Logits
    soever
    1.00
     why
    0.89
    why
    0.79
     WHY
    0.77
     exactly
    0.68
    Origin
    0.68
    ihad
    0.65
    abl
    0.65
    abouts
    0.65
    eve
    0.64
    Act Density 0.036%

    No Known Activations