Negative Logits
cheek         -0.08
乙            -0.08
万            -0.07
Dmit          -0.07
怒            -0.07
acquainted    -0.07
fortnight     -0.07
spouse        -0.07
Tuesday       -0.07
Donnerstag    -0.07
Positive Logits
-specific     0.11
-dependent    0.10
별            0.09
-induced      0.09
-wise         0.09
влияет        0.09
matters       0.08
affects       0.08
mismatch      0.08
ewise         0.08
Activations Density 0.063%