Explanations
instances of the word "promote" or its variations, indicating a focus on promotional activities or initiatives
Neuron Alignment
Index    Value    % of L₁
156      +0.21    1.2%
376      +0.21    1.2%
408      +0.13    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
408      +0.21       0.02
472      +0.21       0.02
300      +0.13       0.02
Negative Logits
ardi        -1.72
ween        -1.68
ajan        -1.61
sel         -1.57
etc         -1.52
nio         -1.48
til         -1.48
shire       -1.47
recently    -1.46
enez        -1.44
Positive Logits
ĻĤ    2.72
ł     2.70
ĥ     2.62
¼     2.56
Ĭ     2.52
Ĩ     2.46
Īĺ    2.44
¤     2.44
³     2.36
¿     2.32
Activations Density 0.032%