INDEX
Explanations
bug bounty and financial rewards
Negative Logits
层              0.46
层的            0.45
ẫu              0.44
意义            0.44
manın           0.43
frequency       0.41
kaleidoscopic   0.41
ölt             0.41
overuse         0.41
antal           0.40
Positive Logits
incentives      1.05
incentive       1.05
incentivize     1.05
reward          1.03
rewards         1.02
incentiv        1.01
Incent          0.98
incent          0.98
rewarded        0.95
Incentive       0.94
Activations Density: 0.049%