INDEX
    Explanations

    references to political figures and their affiliations

    Negative Logits
    "zes"      -0.15
    "uet"      -0.15
    " humor"   -0.14
    "ucz"      -0.14
    "iske"     -0.14
    "utsch"    -0.14
    "ôt"       -0.14
    "pole"     -0.14
    "ursal"    -0.13
    "ius"      -0.13

    Positive Logits
    "opyright"  0.15
    "çī"        0.14
    "èĩ"        0.14
    "à¹Ģล"     0.14
    " clich"    0.14
    "edn"       0.14
    "roid"      0.13
    " Surre"    0.13
    "olen"      0.13
    "аÑĤÑĸ"    0.13

    Activation Density: 0.022%

    No Known Activations