INDEX
Explanations
references to various types of services
Neuron Alignment
Index   Value   % of L₁
5       +0.14   0.8%
94      +0.14   0.7%
13      +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
250     +0.14      0.04
80      +0.14      0.04
0       +0.11      0.04
Negative Logits
ller         -1.79
aliana       -1.61
istically    -1.59
icable       -1.53
óg           -1.51
izing        -1.50
ually        -1.49
such         -1.48
ally         -1.47
uced         -1.44
Positive Logits
documentclass   1.67
srep            1.63
accession       1.60
Parliament      1.52
doctrine        1.52
mine            1.50
bons            1.48
apine           1.47
NATO            1.47
bles            1.46
Activations Density 0.021%