INDEX
Explanations
phrases or terms related to organizations or projects
Neuron Alignment
Index   Value   % of L₁
453     +0.10   0.3%
1871    +0.10   0.3%
2033    +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
453     +0.10      0.06
1831    +0.10      0.03
1471    +0.08      0.04
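The "P. Corr." and "Cos Sim." columns presumably refer to Pearson correlation and cosine similarity. The sketch below shows how the two quantities are computed for a pair of vectors. Which vectors the dashboard actually compares (for example, feature activations against neuron activations, or feature directions against neuron weights) is an assumption not stated here, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variances (unnormalised; the 1/n factors cancel).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical activations over five tokens, for illustration only.
feature_acts = [0.0, 0.5, 0.0, 1.0, 0.2]
neuron_acts = [0.1, 0.4, 0.0, 0.9, 0.3]
print(pearson(feature_acts, neuron_acts))
print(cosine_similarity(feature_acts, neuron_acts))
```

Pearson correlation subtracts the mean from each vector before comparing them, while cosine similarity does not, so the two columns can differ for the same neuron.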
Negative Logits
Hentet          -0.82
Wiktionnaire    -0.66
Preço           -0.62
Datuak          -0.59
Савезне         -0.58
Hvad            -0.58
Географиясе     -0.57
bibnamefont     -0.57
Varför          -0.56
متعلقه          -0.56
Positive Logits
guarante        1.06
emphat          0.94
fuo             0.93
fatis           0.91
effe            0.89
maneu           0.88
Græ             0.88
encomp          0.87
inder           0.86
attemp          0.85
Activations Density 0.292%
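
As a rough illustration of the density figure above, the sketch below assumes that "activation density" means the fraction of token positions on which the feature's activation is above zero. That definition, the `activation_density` name, and the sample data are assumptions for illustration, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [0.4, 1.2, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.300%"
```

Under this reading, a density of 0.292% would mean the feature fires on roughly 3 tokens in every 1000.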