Explanations
contrasting with alternatives
Negative Logits
Token         Logit
Various       1.12
Various       1.12
Specific      1.10
Additional    1.05
Additional    1.01
besondere     1.00
Different     0.99
ADDITIONAL    0.98
Different     0.97
Specific      0.94
Positive Logits
Token         Logit
competitors   1.73
competing     1.65
rival         1.64
rivals        1.61
traditional   1.52
comparable    1.50
competitor    1.47
other         1.43
传统的        1.43
counterparts  1.41
Activations Density: 0.212%