INDEX
    Explanations

    mentions of specific cultural and national affiliations

    Negative Logits
    Token        Logit
    "isons"      -0.15
    "pts"        -0.14
    " asia"      -0.14
    "itu"        -0.14
    "016"        -0.14
    " Gram"      -0.13
    "losures"    -0.13
    "iku"        -0.13
    "ulty"       -0.13
    "irts"       -0.13
    Positive Logits
    Token        Logit
    "wegian"     0.35
    "apanese"    0.34
    "namese"     0.34
    "inese"      0.33
    "uvian"      0.33
    "sonian"     0.32
    "anian"      0.30
    "onian"      0.29
    "ussian"     0.28
    "orean"      0.28
    Act Density 0.202%

    No Known Activations