    Explanations

    instances of the pronoun 'she' and related variations

    Negative Logits
    li          -0.19
    ced         -0.17
    line        -0.16
    ri          -0.15
    nya         -0.15
    ric         -0.15
    met         -0.15
    ayette      -0.15
    wahl        -0.14
    ways        -0.14
    Positive Logits
    oji         0.18
    iad         0.16
    a           0.16
    iene        0.16
    een         0.15
    elem        0.15
    .Handled    0.15
    ihu         0.15
    iÃŁ         0.15
    661         0.15
    Act Density 0.054%

    No Known Activations