INDEX
Explanations
words related to indigenous cultures or countries
Neuron Alignment
Index   Value   % of L₁
544     +0.14   0.5%
528     +0.13   0.5%
900     +0.13   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
544     +0.14      0.03
1370    +0.13      0.03
78      +0.13      0.02
Negative Logits
moze       -0.59
cheidet    -0.52
jeste      -0.50
zove       -0.48
najbolj    -0.47
Grath      -0.47
)_/¯       -0.46
kobieta    -0.46
RTLE       -0.46
للاسماء    -0.46
Positive Logits
Native      1.04
Native      1.03
native      1.02
native      0.93
NATIVE      0.92
fluo        0.86
natives     0.79
Mémoires    0.79
peculi      0.77
Violon      0.77
Activations Density 0.071%