    Explanations

    articles, specifically focusing on the presence of specific determiners like "a," "an," and "the."

    Negative Logits
    Token             Logit
    " betweenstory"   -1.58
    " myſelf"         -1.53
    "^(@)"            -1.51
    " Monfieur"       -1.46
    " Efq"            -1.41
    " Jefus"          -1.40
    " houſe"          -1.40
    " Majefty"        -1.40
    " itſelf"         -1.40
    "BibitemShut"     -1.39
    Positive Logits
    Token   Logit
    "'"     0.98
            0.95
    ","     0.94
            0.92
    "-"     0.90
    ":"     0.89
    ")"     0.89
    "↵↵"    0.85
    "),"    0.85
    " ("    0.84
    Activation Density: 0.337%

    No Known Activations