INDEX
    Explanations

    article followed by specific phrase

    structured, formal writing cues such as numbered items, bullet points, section headers, and policy/instruction-style phrasing.

    Negative Logits
     belladone  0.22
     hadron  0.22
     Referential  0.22
     Fernseh  0.21
    cadherin  0.21
     superposition  0.21
    وسیع  0.20
     ferritin  0.20
     bakteri  0.20
     déficit  0.20
    Positive Logits
    У  0.25
    В  0.21
    Р  0.21
     спорта  0.21
     у  0.20
     М  0.20
     Му  0.20
    (blank)  0.19
    К  0.19
    AR  0.19
    Activation Density: 1.027%

    No Known Activations