Explanations
references to wrestling and combat sports
Neuron Alignment
Index   Value   % of L₁
128     +0.13   0.7%
424     +0.12   0.7%
460     +0.12   0.7%
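The page does not define its columns, so the following is only a minimal sketch under an assumption: that "Value" is one component of a weight vector and "% of L₁" is that component's absolute share of the vector's L₁ norm (the sum of absolute values). The function name `l1_shares` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def l1_shares(weights, top_k=3):
    """Return (index, value, share_of_l1) for the top_k entries by |value|.

    Assumption: "% of L1" means |w_i| / sum_j |w_j|. The page itself
    does not define the column, so treat this as illustrative only.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []
    # Rank indices by absolute component size, largest first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], abs(weights[i]) / l1) for i in ranked[:top_k]]


# Hypothetical toy vector whose L1 norm is exactly 1.0.
toy = [0.5, -0.25, 0.125, 0.125]
rows = l1_shares(toy, top_k=2)
for index, value, share in rows:
    print(f"{index}\t{value:+.2f}\t{share:.1%}")
```

Under this reading, a share as small as 0.7% for the top-ranked neurons would mean the vector's weight is spread thinly across many neurons rather than concentrated in a few.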
Correlated Neurons
Index   P. Corr.   Cos Sim.
460     +0.13      0.01
128     +0.12      0.01
429     +0.12      0.01
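For readers comparing the two columns, here is a small sketch of the standard definitions, assuming "P. Corr." stands for Pearson correlation and "Cos Sim." for cosine similarity. Which vectors the dashboard actually compares is not stated on the page; the function names and inputs below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])


# Hypothetical inputs, chosen only to exercise the two functions.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
print(pearson_correlation(x, y))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Because Pearson correlation centers each vector first, the two measures can differ noticeably on the same pair of vectors. The table may also be applying them to different quantities.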
Negative Logits
ago         -1.73
zes         -1.51
aging       -1.49
orie        -1.48
utt         -1.47
reviewing   -1.40
bowel       -1.40
anus        -1.40
quiry       -1.38
agers       -1.36
Positive Logits
nuts        2.06
doms        1.95
)|$(        1.75
hurst       1.68
nut         1.63
place       1.58
pole        1.55
gered       1.53
ి           1.51
temples     1.51
Activations Density: 0.025%
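As a rough illustration of the density figure, the sketch below assumes that density means the fraction of sampled tokens on which the activation exceeds zero. The page does not give the exact definition or threshold; `activation_density` and the toy data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above threshold.

    Assumption: "density" counts tokens where the activation is nonzero.
    The dashboard's actual definition is not given on the page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: one active token out of 4000.
toy = [0.0] * 3999 + [1.7]
density = activation_density(toy)
print(f"{density:.3%}")  # 0.025%
```

On this reading, a density of 0.025% corresponds to roughly one active token in every 4,000.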