Negative Logits
Difference    -0.65
ammers        -0.64
toggle        -0.63
士            -0.62
oiler         -0.61
haw           -0.59
Panel         -0.58
Burlington    -0.57
Dahl          -0.57
tailed        -0.57
Positive Logits
thereto       0.93
ively         0.79
ivity         0.74
thereof       0.71
ãģĨ           0.70
udes          0.69
ngth          0.69
ract          0.68
xual          0.68
teness        0.67
Activations Density 0.021%