INDEX
Explanations
locations in Asia, specifically mentioning India or China
Neuron Alignment
Index   Value   % of L₁
1842    +0.16   0.5%
394     +0.11   0.4%
1009    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
227     +0.16      0.10
1842    +0.11      0.05
198     +0.11      0.05
Negative Logits
<bos>         -0.95
gynnwys       -0.69
Paglinawan    -0.68
'\\;'         -0.66
YMS           -0.65
وتسجيلات      -0.64
enderror      -0.63
TagMode       -0.62
abetes        -0.62
actionMode    -0.61
Positive Logits
unspeak     2.12
unwarran    2.04
McLaugh     1.99
reluct      1.97
disagre     1.97
apprehen    1.94
affor       1.93
increa      1.92
depic       1.90
inev        1.85
Activations Density: 1.085%