NEGATIVE LOGITS
Token               Logit
Magnet              -0.07
brane               -0.07
regul               -0.06
ultur               -0.06
[token not shown]   -0.06
اکم                 -0.06
tecn                -0.06
.addr               -0.06
insurance           -0.06
cass                -0.06
POSITIVE LOGITS
Token        Logit
Boy          0.29
Boy          0.20
boy          0.17
-boy         0.10
Boys         0.10
boys         0.10
boy          0.09
Playboy      0.08
Dictionary   0.07
AY           0.07
Activations Density 0.005%