    Explanations

    mentions of countries and regions in the text

    Negative Logits
    avar
    -0.17
    celik
    -0.15
    ustralian
    -0.15
    @qq
    -0.15
     world
    -0.14
     Welt
    -0.14
    667
    -0.14
    readcrumb
    -0.14
    okable
    -0.14
    readcrumbs
    -0.14
    Positive Logits
    的大
    0.20
    的一个
    0.19
    最
    0.19
     가장
    0.18
    的一
    0.17
    ’s
    0.16
    もっと
    0.16
    itet
    0.16
    's
    0.16
     largest
    0.16
    Act Density 0.047%

    No Known Activations