INDEX
Explanations
topics related to political controversy and social issues
Neuron Alignment
Index    Value    % of L₁
1150     +0.19    0.6%
1535     +0.15    0.5%
382      +0.14    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
382      +0.19       0.04
1535     +0.15       0.03
1150     +0.14       0.02
Negative Logits
swarovski      -1.86
ecru           -1.82
lamborghini    -1.76
impra          -1.76
increa         -1.75
embodi         -1.74
hairc          -1.74
eiffel         -1.72
cabrio         -1.71
effe           -1.68
Positive Logits
However          0.88
<eos>            0.85
But              0.84
<bos>            0.82
↵↵               0.72
Especies         0.72
However          0.70
IActionResult    0.69
Only             0.69
The              0.68
Activations Density 0.163%