INDEX
Explanations
references to cause and effect relationships within complex systems or processes
Neuron Alignment
Index   Value   % of L₁
674     +0.08   0.2%
1616    +0.07   0.2%
1243    +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1243    +0.08      0.07
972     +0.07      0.05
1160    +0.07      0.06
Negative Logits
obligé       -0.74
évident      -0.67
forcé        -0.66
conseillé    -0.65
préférable   -0.64
déterminé    -0.64
Heere        -0.59
tombé        -0.59
compliqué    -0.58
Pende        -0.58
Positive Logits
/**          0.75
ⓧ            0.75
/*           0.72
easier       0.66
impossible   0.66
possible     0.65
(blank)      0.64
<?           0.64
THISDAY      0.61
difficult    0.61
Activations Density 0.314%