INDEX
NEGATIVE LOGITS
Welfare    -0.09
welfare    -0.09
mirac      -0.08
Steiner    -0.08
Dyn        -0.08
Sper       -0.08
Sussex     -0.07
syrup      -0.07
unst       -0.07
Craig      -0.07
POSITIVE LOGITS
어         0.08
Kil        0.08
azane      0.08
-free      0.08
chic       0.07
-induced   0.07
lz         0.07
db         0.07
�          0.07
일         0.07
Activations Density 0.003%