    Negative Logits
    Token          Logit
    "ſelf"         -0.99
    " Majefty"     -0.94
    " Efq"         -0.89
    "]--;"         -0.88
    " Jefus"       -0.84
    "ſelves"       -0.82
    "Means"        -0.82
    " houſe"       -0.82
    " greateſt"    -0.82
    " Houſe"       -0.81
    Positive Logits
    Token          Logit
    "e"            0.86
    "o"            0.72
    "a"            0.72
    "y"            0.62
    " of"          0.60
    "ever"         0.60
    "el"           0.51
    "how"          0.50
    " it"          0.49
    "top"          0.49
    Activation Density 0.673%

    No Known Activations