    Explanations

    references to male and female characters in various contexts

    Negative Logits
    Token               Logit
    "plode"             -0.16
    "_persona"          -0.15
    "اÙħبر"             -0.15
    "adelphia"          -0.15
    "ocard"             -0.14
    " themselves"       -0.14
    "leine"             -0.14
    "หม"                -0.14
    " RuntimeObject"    -0.14
    "zcze"              -0.14
    Positive Logits
    Token               Logit
    " named"            0.50
    "named"             0.36
    " whose"            0.29
    " Named"            0.29
    " whom"             0.29
    " name"             0.28
    " who"              0.28
    " called"           0.27
    " names"            0.27
    "Named"             0.27
    Activation Density: 0.165%

    No Known Activations