INDEX
Explanations
references to academic research studies and findings
Neuron Alignment
Index   Value   % of L₁
1586    +0.09   0.3%
872     +0.09   0.3%
919     +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1586    +0.09      0.04
919     +0.09      0.02
69      +0.08      0.02
Negative Logits
chré          -0.73
eiffel        -0.69
Vaugh         -0.66
tricot        -0.64
religieuses   -0.64
spirituale    -0.64
eccl          -0.63
cristi        -0.63
veneta        -0.63
umbro         -0.62
Positive Logits
studies          0.76
evidence         0.70
research         0.68
studies          0.63
evidence         0.59
researched       0.59
Studies          0.58
scientifically   0.56
study            0.55
research         0.54
Activations Density 0.317%