    Explanations

    references to influential figures and their actions within a socio-political context

    Negative Logits
    Token           Logit
    "ÅĽÄĩ"          -0.17
    "-Men"          -0.16
    "ROID"          -0.16
    " داخ"          -0.15
    "ãĥ¼ãĥ³"       -0.15
    "šli"           -0.15
    " serr"         -0.15
    " Lanka"        -0.15
    "suppress"      -0.14
    ".localized"    -0.14
    Positive Logits
    Token           Logit
    "inder"         0.33
    " Singh"        0.29
    "Sing"          0.27
    "INDER"         0.27
    " sing"         0.27
    " Gill"         0.26
    "jit"           0.25
    "bir"           0.24
    "pre"           0.23
    " Parm"         0.23
    Activation Density 0.057%
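
    The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of token positions on which the feature's activation is above zero; the function name, the threshold parameter, and the sample data are all hypothetical:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    The dashboard itself does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 57 of 100,000 token positions.
sample = [0.0] * 99_943 + [1.2] * 57
print(f"{activation_density(sample):.3%}")
```

    Under that assumption, 0.057% corresponds to roughly 57 active positions per 100,000 tokens.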

    No Known Activations