    Explanations

    phrases related to geopolitical regions, especially those experiencing conflict

    terms related to ethnic groups

    Negative Logits
    Token                 Logit
    (blank)               -0.79
    "sidemargin"          -0.77
    " Wikimedijinoj"      -0.71
    "CodedInputStream"    -0.70
    "олові"               -0.69
    (not shown)           -0.65
    " ویکی‌پدی"             -0.65
    "LabelTagHelper"      -0.65
    "tagHelperRunner"     -0.64
    "cerely"              -0.64
    Positive Logits
    Token                 Logit
    "!=-"                 0.50
    "iga"                 0.48
    "feer"                0.47
    " Federation"         0.47
    " federation"         0.47
    "stak"                0.46
    " feder"              0.46
    " entity"             0.45
    "faden"               0.45
    " Fed"                0.44
    Activation Density: 0.629%

    No Known Activations