INDEX
Explanations
text related to balance, whether it's a balanced view, explanation, or diet
expressions related to the concept of balance
Negative Logits
olog       -0.79
clips      -0.78
OLOG       -0.76
ABE        -0.73
ography    -0.72
ACH        -0.72
Offic      -0.72
chief      -0.71
Nazi       -0.70
Tat        -0.69
Positive Logits
balanced    1.28
balanced    1.23
imbalance   1.12
Balanced    0.96
balancing   0.96
balance     0.92
balance     0.92
parity      0.90
balances    0.89
neutrality  0.82
Activations Density 0.008%