INDEX
Explanations
promoting justice and a better future
Negative Logits
Productivity     0.46
career           0.43
Carrick          0.43
Career           0.42
productividad    0.42
Performance      0.42
Career           0.42
करियर            0.41
productivity     0.41
Output           0.40
Positive Logits
justice          1.01
justice          0.80
justicia         0.77
fairer           0.76
equitable        0.75
safer            0.74
justiça          0.73
happier          0.72
better           0.71
truly            0.71
Activations Density: 0.009%