INDEX
Explanations
results of experiments and studies, particularly related to social behavior and psychology
Neuron Alignment
Index    Value    % of L₁
690      +0.11    0.3%
1535     +0.10    0.3%
2034     +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1021     +0.11       0.04
1499     +0.10       0.04
1208     +0.09       0.03
Negative Logits
hairc         -0.93
tupperware    -0.92
amigurumi     -0.91
riviera       -0.91
murano        -0.89
michelin      -0.83
idolat        -0.83
nutella       -0.82
wretch        -0.82
medusa        -0.81
Positive Logits
Results      0.94
Results      0.90
RESULTS      0.86
results      0.82
***!         0.80
results      0.76
RESULTS      0.73
BeginInit    0.72
Findings     0.70
findings     0.68
Activations Density 0.289%