INDEX
    Explanations

    references to Jewish identity and the Jewish community

    Negative Logits
    Token           Logit
    "Ñıк"           -0.16
    " jet"          -0.15
    " extinction"   -0.15
    "imoto"         -0.14
    "utton"         -0.14
    " Dort"         -0.14
    "731"           -0.14
    " Reserve"      -0.14
    "xef"           -0.14
    "éĺ¶"           -0.14
    Positive Logits
    Token           Logit
    "ewish"         0.35
    "ews"           0.34
    "uda"           0.32
    "EW"            0.28
    "ewis"          0.27
    " ew"           0.26
    "ew"            0.26
    "UDA"           0.25
    "ewn"           0.23
    "ude"           0.23
    Activation Density: 0.007%

    No Known Activations