INDEX
NEGATIVE LOGITS
imagination   -0.07
бла           -0.07
(full         -0.07
dition        -0.07
問題          -0.06
害            -0.06
そこ          -0.06
(direction    -0.06
νι            -0.06
ден           -0.06
POSITIVE LOGITS
Yelp          0.12
Zuckerberg    0.06
ăn            0.06
stat          0.06
Tim           0.06
Occupy        0.06
ुप            0.06
XS            0.06
roommate      0.06
etype         0.06
Activations Density: 0.001%