INDEX
Explanations
references to public institutions and policies
Neuron Alignment
Index   Value   % of L₁
597     +0.11   0.4%
397     +0.11   0.4%
1870    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
599     +0.11      0.04
597     +0.11      0.02
733     +0.11      0.03
Negative Logits
Token     Logit
viciss    -0.81
abnorm    -0.81
Mlle      -0.81
franz     -0.74
nephe     -0.72
intest    -0.72
„,        -0.72
territo   -0.71
Membre    -0.69
coö       -0.69
Positive Logits
Token      Logit
Public     0.91
public     0.88
Public     0.86
PUBLIC     0.76
public     0.75
pública    0.72
públicos   0.68
PUBLIC     0.67
públicas   0.64
público    0.62
Activations Density: 0.164%