INDEX
    Explanations

    mentions of nationalities or ethnic groups, particularly focusing on Chinese and Japanese references

    Negative Logits
    Token         Logit
    "umer"        -0.15
    "anan"        -0.14
    "ular"        -0.14
    " United"     -0.14
    "ĥĿ"          -0.14
    "gings"       -0.14
    "407"         -0.14
    "954"         -0.14
    "ULAR"        -0.14
    "ffffffff"    -0.13
    Positive Logits
    Token           Logit
    "-American"     0.36
    "-Russian"      0.32
    "-speaking"     0.28
    "-Americans"    0.27
    "-language"     0.24
    "ischer"        0.23
    "-born"         0.22
    "-Israel"       0.21
    "-European"     0.20
    "istan"         0.20
    Act Density 0.205%
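
    The page does not define "Act Density". A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the sample values are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires,
    not the mean activation value.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 20,000 token positions, 41 of which activate the feature.
sample = [0.0] * 19959 + [0.5] * 41
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

    Under this reading, a density of 0.205% would mean the feature fires on roughly 2 of every 1,000 tokens, which fits a feature tied to a narrow family of suffixes such as "-American" or "-speaking".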

    No Known Activations