INDEX
    Explanations

    phrases related to providing explanations or justifications

    multiple instances of the word "reasons" indicating various justifications or causes

    Negative Logits
    Token             Logit
    "yss"             -0.68
    " Winged"         -0.68
    " puck"           -0.67
    " franc"          -0.66
    " Pixie"          -0.66
    " needle"         -0.64
    "ream"            -0.63
    " Mous"           -0.63
    "enged"           -0.62
    "esc"             -0.62
    Positive Logits
    Token             Logit
    " why"            0.91
    " reasons"        0.89
    "cale"            0.84
    " WHY"            0.84
    "pointers"        0.83
    " arguments"      0.80
    "æĦ"              0.78
    " justifying"     0.78
    "why"             0.77
    " explanations"   0.75
    Activation Density: 0.020%

    No Known Activations