INDEX
Explanations
information related to news articles or reports of events, particularly focusing on quotes and discussions
Neuron Alignment
Index  Value  % of L₁
2019   +0.16  0.5%
1133   +0.11  0.4%
1678   +0.10  0.3%
Correlated Neurons
Index  P. Corr.  Cos Sim.
1133   +0.16     0.04
1895   +0.11     0.03
1678   +0.10     0.03
Negative Logits
hairc           -0.81
velour          -0.75
bouncy          -0.72
snapback        -0.69
tupperware      -0.69
intéressante    -0.69
rtx             -0.67
satchel         -0.66
légiti          -0.65
exceptionnelle  -0.65
Positive Logits
{[     0.88
_[     0.86
“[     0.81
([     0.81
<bos>  0.81
>[     0.79
//[    0.78
[      0.77
[$     0.77
=[     0.76
Activations Density: 0.043%