Negative Logits
Token    Logit
DOT      -0.79
dot      -0.78
         -0.72
क्रम      -0.71
طرف      -0.70
romas    -0.69
gnose    -0.69
μην      -0.69
sie      -0.68
呕       -0.68
Positive Logits
Token    Logit
hook     2.41
hooks    2.36
use      2.34
hooks    2.34
Hooks    2.14
Hook     2.13
Hooks    2.08
Hook     2.02
hook     1.98
use      1.89
Activations Density: 0.025%