INDEX
Explanations
abbreviations or acronyms related to specific individuals or organizations
Neuron Alignment
Index    Value    % of L₁
50       +0.27    1.1%
1343     +0.22    0.9%
1150     +0.12    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1343     +0.27       0.04
1259     +0.22       0.03
1408     +0.12       0.03
Negative Logits
<bos>           -2.61
ⓧ               -1.26
/***            -0.97
<?              -0.97
/**             -0.80
ProtoMessage    -0.80
solidar         -0.80
/*!             -0.78
///**           -0.77
intptr          -0.76
Positive Logits
impra           1.09
soulign         1.07
véhic           1.03
affor           0.99
maneu           0.98
considér        0.98
Manufact        0.96
pleins          0.94
unlaw           0.92
Byp             0.92
Activations Density: 0.022%