INDEX
Explanations
occurrences of proper nouns, particularly names and locations related to current events
Negative Logits
/bower      -0.17
acimiento   -0.16
мÑĸнÑĸ       -0.16
_TD         -0.15
Desk        -0.15
885         -0.14
rana        -0.14
ừa          -0.14
ç»Ī         -0.14
Weinstein   -0.14
Positive Logits
Lewis       0.18
Reuters     0.16
alter       0.16
елик        0.14
pus         0.14
Alter       0.14
_SKIP       0.14
succ        0.14
centr       0.14
earlier     0.14
Activations Density: 0.006%