INDEX
Explanations
references to social status and hierarchies
Neuron Alignment
Index   Value   % of L₁
1618    +0.14   0.6%
1306    +0.14   0.5%
1387    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1306    +0.14      0.03
1618    +0.14      0.02
1387    +0.11      0.02
Negative Logits
naran      -0.58
Joaqu      -0.53
Punj       -0.52
capitan    -0.51
cciale     -0.50
decret     -0.50
schi       -0.49
ceptives   -0.49
pinak      -0.49
icleta     -0.47
Positive Logits
status      1.42
Status      1.35
status      1.30
STATUS      1.25
statuses    1.20
Status      1.19
getStatus   1.15
STATUS      1.13
setStatus   1.13
getStatus   1.11
Activations Density 0.072%