INDEX
Explanations
mentions of the word "card"
Neuron Alignment
Index   Value   % of L₁
1491    +0.14   0.6%
1124    +0.13   0.5%
1416    +0.13   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1059    +0.14      0.03
1124    +0.13      0.03
1491    +0.13      0.03
Negative Logits
maxime      -0.53
ougars      -0.49
Tomé        -0.48
vogliamo    -0.47
libere      -0.47
groupName   -0.47
vorrei      -0.47
pageNum     -0.45
كومونز      -0.44
pertanto    -0.44
Positive Logits
card    1.64
card    1.50
cards   1.47
Card    1.46
Card    1.37
Cards   1.36
cards   1.34
CARD    1.31
CARD    1.30
Cards   1.22
Activations Density: 0.085%