    Explanations
    No Explanations Found
    Negative Logits
    Token               Logit
    " itſelf"           -1.33
    " myſelf"           -1.24
    " raiſ"             -1.16
    (token missing)     -1.15
    " Phry"             -1.15
    " Mahomet"          -1.13
    " pleaſure"         -1.12
    " Houſe"            -1.12
    " doubtnut"         -1.11
    " Shakspeare"       -1.09

    Positive Logits
    Token               Logit
    " the"              2.01
    " The"              1.45
    "The"               1.30
    " THE"              1.28
    " same"             1.14
    "the"               1.05
    " a"                1.01
    "ethe"              0.98
    " new"              0.97
    "enthe"             0.96

    Activation Density: 0.000%

    No Known Activations

    This feature has no known activations.