INDEX
    Explanations

    articles and determiners in sentences

    Negative Logits
    " houſe"        -1.00
    " itſelf"       -0.90
    " fubject"      -0.84
    " ſtate"        -0.84
    " myſelf"       -0.80
    " propOrder"    -0.79
    " purpoſe"      -0.79
    " pleaſure"     -0.78
    " ſch"          -0.75
    " Majefty"      -0.75

    Positive Logits
    " der"           1.18
    " Die"           0.99
    " Der"           0.95
    " den"           0.92
    "Der"            0.89
    "Die"            0.88
    " die"           0.83
    " Den"           0.82
    " ihrer"         0.82
    " Ihrer"         0.79
    Act Density 0.016%

    No Known Activations