INDEX
Explanations
references to specific cards and strategies used in a competitive card game
Neuron Alignment
Index    Value    % of L₁
506      +0.15    0.6%
122      +0.15    0.6%
1837     +0.13    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
506      +0.15       0.02
1837     +0.15       0.02
156      +0.13       0.02
Negative Logits
Cn            -0.43
Cn            -0.42
censiti       -0.42
rungsseite    -0.42
OGND          -0.41
Datuak        -0.41
namentales    -0.41
GTCX          -0.41
phorbia       -0.40
Hale          -0.40
Positive Logits
deck       1.49
Deck       1.44
Deck       1.37
deck       1.36
decks      1.36
DECK       1.26
Decks      1.24
decks      1.06
decking    0.87
デッキ     0.85
Activations Density 0.073%