INDEX
Explanations
mentions of countries and regions in the text
Negative Logits
avar
-0.17
celik
-0.15
ustralian
-0.15
-0.15
world
-0.14
Welt
-0.14
667
-0.14
readcrumb
-0.14
okable
-0.14
readcrumbs
-0.14
Positive Logits
的大
0.20
的一个
0.19
最
0.19
가장
0.18
的一
0.17
’s
0.16
もっと
0.16
itet
0.16
's
0.16
largest
0.16
Activations Density 0.047%
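
As a rough illustration of what a density figure like the one above could mean, the sketch below computes the fraction of token positions on which a feature fires. It assumes density is defined as "share of tokens with activation above a threshold", which this page does not state. The function name `activation_density`, the threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature is active.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: a feature that fires on 47 of 100,000 token positions.
acts = [0.0] * 100_000
for i in range(47):
    acts[i * 2000] = 1.5  # arbitrary positive activation value

density = activation_density(acts)
print(f"{density:.3%}")  # 0.047%
```

Under this assumed definition, a density of 0.047% corresponds to the feature being active on roughly 1 token in 2,100.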