INDEX
Explanations
mentions of wages and minimum wage issues
Neuron Alignment
Index   Value   % of L₁
219     +0.16   0.9%
376     +0.14   0.8%
250     +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
219     +0.16      0.02
167     +0.14      0.02
85      +0.12      0.02
Negative Logits
uke        -1.53
FIG        -1.45
mention    -1.40
\)         -1.35
Cause      -1.34
interest   -1.32
Sea        -1.29
Empire     -1.29
usions     -1.27
allel      -1.26
Positive Logits
wright     1.79
ENTIAL     1.72
"}](#      1.67
ably       1.55
etable     1.48
kowski     1.48
haus       1.47
epi        1.45
suffers    1.45
schemas    1.44
Activations Density 0.019%