INDEX
    Explanations

    references to political figures and institutions

    Negative Logits
    Token       Logit
    "eyse"      -0.16
    "ropoda"    -0.16
    "idor"      -0.16
    "utin"      -0.16
    "ruba"      -0.15
    "jure"      -0.15
    "ochen"     -0.15
    "ecure"     -0.14
    " eldre"    -0.14
    "æ¦ľ"       -0.14
    Positive Logits
    Token       Logit
    " who"      0.62
    " whose"    0.46
    "who"       0.43
    " whom"     0.35
    "whose"     0.33
    " Who"      0.33
    " with"     0.32
    "Who"       0.30
    " quien"    0.29
    " qui"      0.28
    Activation Density: 0.640%

    No Known Activations