NEGATIVE LOGITS
Token      Logit
belong     -0.80
belongs    -0.76
separat    -0.68
Goodbye    -0.66
Badge      -0.65
izable     -0.65
Stability  -0.65
士         -0.65
Carry      -0.64
Tags       -0.64
POSITIVE LOGITS
Token       Logit
able        1.34
unable      1.26
surprised   1.20
hesitant    1.19
reluctant   1.18
alerted     1.15
unaware     1.14
aware       1.13
astonished  1.11
pleased     1.11
Activations Density 0.271%