INDEX
    Explanations

    references to Indonesia or Indonesian identity

    Negative Logits
     للمعارف
    -0.63
     keinem
    -0.53
     sammen
    -0.52
    -0.50
    Lähteet
    -0.50
    iprot
    -0.49
     wachten
    -0.49
    ريل
    -0.49
    político
    -0.49
     hoffen
    -0.48
    Positive Logits
     Indonesia
    0.92
    Indonesia
    0.85
     Indonesian
    0.76
    AddTagHelper
    0.76
     Indones
    0.71
    ertale
    0.67
     ID
    0.65
    JAKARTA
    0.65
     indonesia
    0.65
    UnsafeEnabled
    0.65
    Activation Density 0.055%

    No Known Activations