INDEX
Explanations
phrases related to the preservation and documentation of information or evidence
Neuron Alignment
Index    Value    % of L₁
1173     +0.10    0.3%
490      +0.08    0.2%
382      +0.08    0.2%
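
The "% of L₁" column is not defined on this page. One plausible reading is that it gives the share of a weight vector's L₁ norm carried by a single neuron. The sketch below illustrates that reading only; the function name `percent_of_l1` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def percent_of_l1(weights, index):
    """Return the share (in percent) of the L1 norm carried by weights[index].

    Assumption: "% of L1" means |w_i| / sum_j |w_j|. The page itself
    does not state this definition.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return 100.0 * abs(weights[index]) / l1_norm


# Hypothetical toy vector, not the feature's real weights.
toy_weights = [0.5, -0.25, 0.25]
print(percent_of_l1(toy_weights, 0))
```

Under this reading, a value of +0.10 at 0.3% would imply an L₁ norm of roughly 33, spread over many neurons.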
Correlated Neurons
Index    P. Corr.    Cos Sim.
1806     +0.10       0.04
1173     +0.08       0.04
257      +0.08       0.04
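
The two columns above measure different things, assuming "P. Corr." stands for Pearson correlation (typically computed over activation values) and "Cos Sim." for cosine similarity (typically computed between direction vectors). The page does not say which vectors were compared, so the following is a generic sketch of the two formulas with hypothetical helper names, not the dashboard's own computation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance of mean-centered values, normalized."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cosine_similarity(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy inputs for illustration only.
print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Pearson correlation subtracts the means first and cosine similarity does not, so the two can differ substantially for the same pair of vectors.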
Negative Logits
kram         -0.68
épu          -0.68
prét         -0.67
!!</         -0.67
habet        -0.65
élar         -0.65
UwU          -0.63
éto          -0.63
quæ          -0.63
rassemble    -0.61
Positive Logits
placed       0.74
examined     0.74
kept         0.73
taken        0.71
replaced     0.68
handled      0.68
brought      0.68
analysed     0.68
inspected    0.68
cared        0.68
Activations Density 0.306%
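
Activation density is commonly read as the fraction of sampled tokens on which the feature fires at all. The sketch below assumes that definition with a firing threshold of zero; the function name and the sample list are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds the threshold.

    Assumption: a token counts as "active" when its activation is strictly
    greater than `threshold` (zero by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return 100.0 * fired / len(activations)


# Hypothetical sample: the feature fires on 1 of 4 tokens.
print(activation_density([0.0, 0.0, 0.5, 0.0]))
```

Under this definition, a density of 0.306% corresponds to the feature firing on roughly 3 of every 1,000 sampled tokens.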