Explanations
references to products and ownership
Negative Logits
ileş      -0.18
角色      -0.18
キー      -0.16
çiler     -0.16
ドル      -0.16
śmy       -0.15
kendisi   -0.15
入り      -0.15
夫人      -0.15
him       -0.14
Positive Logits
人的      0.45
者的      0.41
生的      0.40
力的      0.38
子的      0.37
家的      0.37
자의      0.35
사의      0.33
”的       0.31
기의      0.31
Activations Density: 0.464%