INDEX
    Explanations

    phrases that introduce hypothetical scenarios or assumptions

    Negative Logits
                -1.33
     (          -1.10
    .           -1.08
     "          -1.07
    <eos>       -1.06
     The        -1.02
                -1.02
     A          -1.01
    ↵↵          -1.00
     B          -0.98
    Positive Logits
     myſelf     2.00
     Efq        1.98
     itſelf     1.92
     purpoſe    1.84
     pleaſure   1.83
     houſe      1.81
     becauſe    1.81
     Monfieur   1.80
     ſtate      1.80
     Jefus      1.79
    Act Density 0.232%

    No Known Activations