INDEX
    Explanations

    references to specific subjects and pronouns in the text

    Negative Logits
    ocks        -0.18
    swire       -0.17
    ance        -0.15
    rang        -0.15
     ed         -0.14
    vals        -0.14
     Diaz       -0.14
     Nej        -0.14
    orst        -0.14
    ÑģÑĤин      -0.14
    Positive Logits
    '&&         0.15
    ritch       0.15
    ¤¤          0.15
    éry         0.14
    flater      0.13
     absent     0.13
    isse        0.13
    _TA         0.13
    pond        0.13
     Boeh       0.13
    Act Density 0.144%

    No Known Activations