INDEX
    Explanations

    sequences of whitespace characters

    Negative Logits
    Token       Logit
    (blank)     -0.71
    (blank)     -0.60
    " to"       -0.58
    "<eos>"     -0.53
    "'"         -0.53
    " are"      -0.52
    " '"        -0.52
    "prom"      -0.52
    " h"        -0.51
    " lo"       -0.49

    Positive Logits
    Token         Logit
    " pleaſure"   1.28
    " purpoſe"    1.18
    " uſe"        1.13
    " myſelf"     1.12
    " ſtate"      1.10
    " itſelf"     1.07
    " houſe"      1.05
    " ſever"      1.05
    " ſmall"      1.02
    " greateſt"   1.01
    Activation Density: 0.155%

    No Known Activations