INDEX
    Explanations

    pronouns and references to individuals in various contexts

    Negative Logits
    Token           Logit
    "lete"          -0.15
    ".utilities"    -0.14
    " sel"          -0.14
    "alsex"         -0.14
    "isku"          -0.13
    "ovit"          -0.13
    "asted"         -0.13
    "lite"          -0.13
    "lg"            -0.13
    "ellt"          -0.13
    Positive Logits
    Token           Logit
    "rganization"   0.14
    "iver"          0.14
    "ideo"          0.14
    "857"           0.14
    "rott"          0.13
    " opposed"      0.13
    "theast"        0.13
    "鮮"            0.13
    "eyJ"           0.13
    "ottom"         0.13
    Act Density 0.055%

    No Known Activations