    Explanations

    reasons or explanations indicated in a sentence

    phrases that explain reasons or justifications for various situations

    Negative Logits
     puck
    -0.73
     Carbuncle
    -0.71
    inav
    -0.68
     sweats
    -0.66
     helicop
    -0.65
     broom
    -0.64
    chron
    -0.63
    KY
    -0.62
     Samurai
    -0.61
    eg
    -0.60
    Positive Logits
     why
    1.16
    abl
    1.01
     WHY
    0.99
    why
    0.93
     behind
    0.78
    forward
    0.73
    Why
    0.72
     Why
    0.71
     justifying
    0.70
    ¿½
    0.70
    Act Density 0.026%

    No Known Activations