    Negative Logits
    Token          Logit
    "s"            -0.65
    "m"            -0.60
    "shop"         -0.51
    " m"           -0.48
    "n"            -0.48
    "sz"           -0.47
    "sho"          -0.45
    "b"            -0.44
    "sha"          -0.44
    "bag"          -0.44

    Positive Logits
    Token          Logit
    " Efq"          1.05
    " Monfieur"     1.00
    " itſelf"       0.99
    " Shakspeare"   0.98
    " Cæsar"        0.98
    " myſelf"       0.96
    " ſeveral"      0.92
    " Jefus"        0.91
    " whoſe"        0.91
    " Majefty"      0.90
    Activation Density: 0.454%

    No Known Activations