Explanations
mentions of the word "pocket."
Neuron Alignment
Index   Value   % of L₁
1233    +0.15   0.6%
687     +0.14   0.6%
1926    +0.12   0.5%
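The page does not define the "% of L₁" column. One plausible reading, stated here purely as an assumption, is that it reports each entry's absolute value as a share of the L₁ norm (the sum of absolute values) of the whole vector. A minimal sketch of that calculation follows; the function name `l1_share` and the toy vector are hypothetical and are not taken from this page.

```python
def l1_share(weights):
    """Return each entry's |value| as a fraction of the vector's L1 norm.

    Assumption: this is one plausible meaning of a "% of L1" column;
    the source page does not state its definition.
    """
    l1 = sum(abs(w) for w in weights)  # L1 norm: sum of absolute values
    if l1 == 0:
        # An all-zero vector has no meaningful shares; return zeros.
        return [0.0 for _ in weights]
    return [abs(w) / l1 for w in weights]


# Toy vector for illustration only (not data from the dashboard).
toy = [0.5, -0.25, 0.25]
shares = l1_share(toy)
print([f"{100 * s:.1f}%" for s in shares])  # prints ['50.0%', '25.0%', '25.0%']
```

Under this reading, a small percentage for the top-ranked neurons would mean the vector's weight is spread across many neurons rather than concentrated in a few.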
Correlated Neurons
Index   P. Corr.   Cos Sim.
1233    +0.15      0.02
1872    +0.14      0.02
687     +0.12      0.02
Negative Logits
Token     Logit
inappro   -0.78
disagre   -0.76
squa      -0.76
unden     -0.74
sii       -0.72
wien      -0.70
desir     -0.70
emphat    -0.70
weber     -0.68
edp       -0.68
Positive Logits
Token             Logit
[token missing]   1.58
[token missing]   1.45
[token missing]   1.42
[token missing]   1.42
pockets           1.35
Pockets           1.04
wallet            0.93
wallet            0.90
wallets           0.81
ポケット          0.78
Activations Density: 0.098%