INDEX
NEGATIVE LOGITS
jeopard      -0.08
overweight   -0.08
-threat      -0.08
banned       -0.08
BMI          -0.08
ban          -0.08
Ban          -0.08
threatens    -0.07
wat          -0.07
stereotypes  -0.07
POSITIVE LOGITS
guid    0.09
Μέ      0.09
Guid    0.09
.Guid   0.09
개      0.08
માર્ગ    0.08
യൂണ     0.08
μον     0.08
guid    0.08
Κ       0.08
Activations Density 0.018%