    NEGATIVE LOGITS
    Token            Logit
    " themſelves"    -1.41
    " myſelf"        -1.34
    " itſelf"        -1.34
    " Jefus"         -1.31
    " houſe"         -1.26
    " Monfieur"      -1.24
    " purpoſe"       -1.20
    " himſelf"       -1.19
    " ſtate"         -1.19
    " Efq"           -1.18
    POSITIVE LOGITS
    Token     Logit
    "ly"      0.77
    "ting"    0.63
    " in"     0.56
    " set"    0.54
    "ton"     0.54
    " che"    0.53
    " ba"     0.50
    " cho"    0.50
    "mal"     0.50
    "st"      0.49
    Activation Density: 1.714%

    No Known Activations