INDEX
    Explanations

    occurrences of the word "A" and emphasizes possessive or plural pronouns

    Negative Logits
    opa        -0.14
    avou       -0.14
    ilon       -0.14
    ipa        -0.14
    ellt       -0.14
    rike       -0.13
    ymology    -0.13
    alous      -0.13
     Affero    -0.13
    ihn        -0.13
    Positive Logits
    bam           0.14
    etiyle        0.14
    ledon         0.13
    EMU           0.13
    iet           0.13
    atrix         0.13
    itti          0.13
    ulu           0.13
    .wikipedia    0.13
    anela         0.13
    Activation Density: 0.058%

    No Known Activations