INDEX
Explanations
words related to pollution and environmental issues
Neuron Alignment
Index   Value   % of L₁
2034    +0.15   0.5%
1819    +0.14   0.4%
764     +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
284     +0.15      0.11
1013    +0.14      0.09
1819    +0.12      0.09
Negative Logits
purcha    -1.49
emphat    -1.49
affor     -1.48
increa    -1.48
reluct    -1.46
fuf       -1.46
encomp    -1.40
desir     -1.36
depic     -1.36
volunte   -1.35
Positive Logits
.                  0.90
.*;                0.71
;                  0.69
]]:                0.69
TimeoutException   0.68
.;                 0.68
}.                 0.67
.}                 0.67
().                0.66
!                  0.66
Activations Density: 0.805%