    Explanations

    connections and relationships in reasoning or explanations

    Negative Logits
    Token            Logit
    " myſelf"        -1.03
    " purpoſe"       -1.02
    " fubject"       -1.00
    " Shakspeare"    -0.98
    " ſtate"         -0.96
    " Reſ"           -0.94
    " Majefty"       -0.93
    " itſelf"        -0.91
    " greateſt"      -0.90
    " ſever"         -0.89
    Positive Logits
    Token            Logit
    " the"           0.89
    " a"             0.77
    ":"              0.70
    " an"            0.69
    " several"       0.63
    " adanya"        0.57
    " two"           0.56
    " faptul"        0.56
    " that"          0.56
    " some"          0.56
    Activation Density: 0.732%
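
    The density figure above is usually read as the share of sampled token positions on which the feature fires at all. The page does not say how it was computed, so the following is only a minimal sketch under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions where the feature is active.

    Assumption: "active" means the activation is strictly above `threshold`.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical activations for one feature over 8 token positions.
sample = [0.0, 0.0, 0.0, 2.1, 0.0, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3f}%")  # prints 12.500%
```

    Under this reading, a density of 0.732% would mean the feature is active on fewer than 1 in 100 sampled tokens.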

    No Known Activations