Explanations
descriptions of challenging and rewarding work environments
Neuron Alignment
Index   Value   % of L₁
1013    +0.11   0.3%
166     +0.08   0.2%
1614    +0.07   0.2%
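The "% of L₁" column is not defined on this page. A minimal sketch of one plausible reading, assuming it is the share of a weight vector's L₁ norm contributed by a single entry, could look like the following. The function name `l1_share` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a percentage of the vector's L1 norm.

    Assumption: "% of L1" means one entry's absolute value divided by
    the sum of absolute values of all entries. The source does not
    confirm this definition.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("vector has zero L1 norm")
    return 100.0 * abs(weights[index]) / l1_norm


# Toy 4-entry vector (hypothetical values, not dashboard data).
toy = [0.5, -0.25, 0.125, 0.125]
share_of_first = l1_share(toy, 0)
```

Under this reading, shares of 0.2% to 0.3% for the top three neurons would mean the vector's weight is spread thinly across many neurons instead of being concentrated in a few.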
Correlated Neurons
Index   P. Corr.   Cos Sim.
1958    +0.11      0.05
1607    +0.08      0.03
1613    +0.07      0.04
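The two columns above are abbreviated and the page does not say which vectors they compare. Assuming "P. Corr." is a Pearson correlation and "Cos Sim." is a cosine similarity, the two measures can be sketched as below. The helper names and sample inputs are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical sample inputs, not dashboard data.
r = pearson_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
c = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

The only difference between the two is mean-centering, so they can diverge when the compared vectors have non-zero means.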
Negative Logits
maksi      -1.00
silikon    -0.93
alkoh      -0.91
jaya       -0.89
keramik    -0.89
kosme      -0.86
optik      -0.85
akut       -0.83
makro      -0.82
protokol   -0.79
Positive Logits
seeing         0.70
watching       0.63
interacting    0.62
discovering    0.60
knowing        0.58
experiencing   0.57
hearing        0.57
being          0.55
feeling        0.54
doing          0.52
Activations Density 0.494%
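Activation density is commonly taken to be the fraction of sampled tokens on which a feature fires. A minimal sketch under that assumption follows. The function name, the zero threshold, and the sample activations are hypothetical, and the page does not state how its 0.494% figure was computed.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is
    strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations: one active token out of four.
density = activation_density([0.0, 0.0, 0.3, 0.0])
```

Under this reading, a density of 0.494% would correspond to roughly one active token in every 200 sampled.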