INDEX
Explanations
mentions of battery-related terms
Neuron Alignment
Index   Value   % of L₁
64      +0.12   0.6%
303     +0.12   0.6%
475     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
223     +0.12      0.02
82      +0.12      0.02
234     +0.11      0.02
Negative Logits
goal          -1.79
preliminary   -1.72
facility      -1.72
ters          -1.69
habitat       -1.60
system        -1.60
mitigation    -1.59
goodbye       -1.56
if            -1.50
management    -1.50
Positive Logits
¤     2.67
ĺ     2.61
Ľ     2.58
Ĥ     2.58
ĻĤ    2.46
ĸ     2.45
©     2.39
¸     2.31
±     2.31
Ļª    2.29
Activations Density: 0.004%