INDEX
Explanations
the word "Canada" or terms related to Canadian institutions and policies
Neuron Alignment
Index  Value  % of L₁
645    +0.14  0.5%
555    +0.13  0.5%
90     +0.13  0.4%
Correlated Neurons
Index  P. Corr.  Cos Sim.
645    +0.14     0.05
555    +0.13     0.03
920    +0.13     0.03
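The table above reports two different similarity measures, and they need not agree: Pearson correlation is the cosine similarity of mean-centered vectors, so a constant offset changes the cosine but not the correlation. The following is a minimal sketch of that relationship using made-up toy vectors. The names `feature` and `neuron` and their values are illustrative assumptions, and the sketch does not claim to reproduce which quantities the dashboard actually compares (activations, weights, or something else).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as the cosine of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical toy data (not taken from the dashboard):
# `neuron` is `feature` shifted by a constant offset of 1.0.
feature = [0.0, 0.0, 1.0, 0.0, 2.0]
neuron = [1.0, 1.0, 2.0, 1.0, 3.0]

p_corr = pearson_correlation(feature, neuron)
cos_sim = cosine_similarity(feature, neuron)

print(f"P. Corr.: {p_corr:.3f}")   # 1.000
print(f"Cos Sim.: {cos_sim:.3f}")  # 0.894
```

In this toy case the offset leaves the correlation at 1 while lowering the cosine similarity, which is one way the two columns can diverge.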
Negative Logits
kloped         -0.60
مصادر          -0.54
يتيمه          -0.53
Datuak         -0.51
feira          -0.51
;%%            -0.51
חיצוניים       -0.51
fourths        -0.49
iecie          -0.49
verwijspagina  -0.49
Positive Logits
canadian   1.37
Canada     1.31
canada     1.31
Kanada     1.26
Canadians  1.24
Canadian   1.23
CANADA     1.22
Canada     1.20
Canad      1.20
CANADIAN   1.19
Activations Density 0.067%