INDEX
Explanations
mentions of retirement and related terms
Neuron Alignment
Index   Value   % of L₁
156     +0.18   1.0%
423     +0.15   0.8%
46      +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
423     +0.18      0.03
449     +0.15      0.01
46      +0.11      0.02
Negative Logits
©     -2.02
ĨĴ    -1.99
ĥ½    -1.88
Īĺ    -1.86
ĵ     -1.85
Ĵ     -1.83
Ķ     -1.78
į     -1.74
ħ     -1.71
ī     -1.69
Positive Logits
junior    1.70
retire    1.62
senior    1.59
prising   1.59
icular    1.54
ftware    1.47
iating    1.45
ktop      1.43
hobby     1.43
iator     1.40
Activations Density: 0.169%