INDEX
    Explanations

    References to nationalities and countries, particularly Chinese and Swedish entities

    Negative Logits
    "gings"     -0.15
    "bidden"    -0.15
    "alling"    -0.14
    "UBLISH"    -0.14
    " reput"    -0.14
    "aller"     -0.14
    "lect"      -0.14
    "中國"      -0.14
    "enton"     -0.14
    "atr"       -0.13
    Positive Logits
    "-American"     0.43
    "-speaking"     0.34
    "-Americans"    0.33
    "-Russian"      0.33
    "-born"         0.30
    "-language"     0.30
    "-Israel"       0.28
    "-flag"         0.25
    "ischer"        0.25
    "-made"         0.24
    Activation Density: 0.233%

    No Known Activations