INDEX
Explanations
references to behind-the-scenes actions or events
Neuron Alignment
Index    Value    % of L₁
757      +0.10    0.3%
674      +0.10    0.3%
994      +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
757      +0.10       0.04
1525     +0.10       0.03
900      +0.09       0.03
Negative Logits
unspeak       -1.12
impra         -1.01
withal        -1.01
tolerably     -1.00
apprehen      -0.96
gaily         -0.96
vainly        -0.94
impelled      -0.93
Shakspeare    -0.92
indescri      -0.91
Positive Logits
felicità        0.75
pamamagitan     0.74
panahon         0.73
bahay           0.72
frastructure    0.71
behind          0.69
ideolog         0.69
bawat           0.68
behind          0.68
sarili          0.67
Activations Density: 0.089%