INDEX
Explanations
instances of the word "whose."
Neuron Alignment
Index   Value   % of L₁
478     +0.16   0.9%
188     +0.13   0.7%
4       +0.13   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
4       +0.16      0.03
188     +0.13      0.03
170     +0.13      0.03
Negative Logits
thems         -1.49
mine          -1.47
Category      -1.43
:`            -1.38
represented   -1.36
shire         -1.35
:]            -1.34
considering   -1.34
gov           -1.33
CMS           -1.31
Positive Logits
Į     3.02
ª     2.90
Ļ     2.82
İ     2.81
³     2.70
·¸    2.69
¨     2.69
¥     2.68
¸     2.67
¶     2.66
Activations Density 0.223%