Negative Logits
Uber            -0.07
пораж           -0.06
П               -0.06
Bard            -0.06
Wifi            -0.06
lowering        -0.06
Loft            -0.06
Healthcare      -0.06
Wow             -0.06
VIP             -0.06
Positive Logits
pronunciation   0.07
_lib            0.06
漫              0.06
(condition      0.06
errar           0.06
_hist           0.06
debugging       0.06
)set            0.06
racial          0.06
horse           0.06
Activations Density: 0.015%