    Explanations

    references to female characters and their relationships within stories

    Negative Logits
        fter      -0.17
        hi        -0.16
        ele       -0.14
        elper     -0.14
        nge       -0.14
        ing       -0.14
        hips      -0.14
        odem      -0.14
        quiv      -0.14
        arer      -0.14
    Positive Logits
        afort     0.16
        å¡ļ       0.14
        //*[@     0.14
        Ñįй       0.14
        /Branch   0.13
        annes     0.13
        "urls     0.13
        urdu      0.13
        617       0.13
        AFX       0.13
    Activation Density: 0.389%

    No Known Activations