    Explanations

    mentions of influential political figures and educational institutions

    Negative Logits
    ereotype         -0.15
    allo             -0.14
    utsch            -0.14
    nilai            -0.14
    ando             -0.13
    intl             -0.13
    inka             -0.13
    .magic           -0.13
    ore              -0.13
    ording           -0.12
    Positive Logits
    #aa              0.15
    ç·Ĵ              0.14
    498              0.14
     RequestOptions  0.13
    884              0.13
    lasses           0.13
    ownik            0.13
     Rodgers         0.13
    áce              0.13
     saf             0.12
    Act Density 0.034%

    No Known Activations