INDEX
Explanations
mentions related to workplace policies and employee benefits
Neuron Alignment
Index    Value    % of L₁
1533     +0.14    0.4%
509      +0.10    0.3%
1823     +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1533     +0.14       0.02
1366     +0.10       0.04
1823     +0.09       0.03
Negative Logits
stratigraph    -0.88
Kün            -0.84
embodi         -0.70
unden          -0.70
entibus        -0.67
Bartholo       -0.67
glau           -0.66
arxiv          -0.66
unlaw          -0.65
ferru          -0.64
Positive Logits
<bos>         0.96
parenting     0.91
childcare     0.78
Parenting     0.72
'\\;'         0.69
parenthood    0.65
motherhood    0.65
parents       0.65
parenting     0.64
parental      0.62
Activations Density 0.448%