Negative Logits
これは          0.49
simply          0.46
아야            0.45
érature         0.41
simplement      0.41
abus            0.40
clearfix        0.40
gewoon          0.39
simplemente     0.39
Simply          0.38
Positive Logits
dare            0.85
dared           0.76
Dare            0.70
dares           0.66
Dare            0.66
dare            0.64
敢              0.64
daring          0.63
不敢            0.55
fearless        0.48
Activations Density 0.012%