INDEX
    Explanations

    variability and streams

    Negative Logits
     poffe        -0.91
     myſelf       -0.90
     Efq          -0.89
     itſelf       -0.88
     reaſon       -0.81
     poffible     -0.79
     themſelves   -0.78
     Majefty      -0.78
     ſtand        -0.78
     pleaſure     -0.77
    Positive Logits
    s             0.90
    ers           0.74
    ron           0.69
    ry            0.66
    ual           0.62
    sun           0.61
    r             0.60
    ings          0.60
    ative         0.59
    son           0.58
    Activation Density 0.189%

    No Known Activations