INDEX
Explanations
instances of the preposition "to."
Neuron Alignment
Index   Value   % of L₁
118     +0.14   0.8%
481     +0.13   0.7%
352     +0.10   0.6%
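As a rough illustration of the "% of L₁" column, the sketch below assumes it reports the absolute value of one entry of a weight vector as a share of that vector's L₁ norm (the sum of absolute values). The page does not confirm this definition, and the function name `l1_share` and the toy vector are hypothetical, not the feature's real weights.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means this ratio; the dashboard does not say so.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("a zero vector has no L1 share")
    return abs(weights[index]) / l1_norm


# Toy, made-up vector used only to show the arithmetic.
toy_weights = [0.5, -0.25, 0.25]
share = l1_share(toy_weights, 0)
print(f"{share:.1%}")
```

Under this reading, a value of +0.14 at 0.8% would imply an L₁ norm of roughly 17.5 for the full vector, though the rounded figures on the page make that only an estimate.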
Correlated Neurons
Index   P. Corr.   Cos Sim.
118     +0.14      0.06
459     +0.13      0.05
505     +0.10      0.05
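The two columns above can be sketched as follows, assuming "P. Corr." stands for the Pearson correlation and "Cos Sim." for the cosine similarity between two vectors. Which vectors the dashboard actually compares (activations over tokens, weight directions, or something else) is not stated on the page, so the inputs here are hypothetical toy lists.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Toy, made-up data (not taken from the dashboard).
feature_acts = [0.0, 1.0, 0.0, 2.0]
neuron_acts = [0.1, 0.9, 0.2, 1.5]
print(pearson_correlation(feature_acts, neuron_acts))
print(cosine_similarity(feature_acts, neuron_acts))
```

The two measures differ only in the mean-centering step, which is why a pair of vectors can show a noticeably higher correlation than cosine similarity, or the reverse.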
Negative Logits
eigenvectors   -1.54
himself        -1.53
herself        -1.50
ialize         -1.48
disability     -1.44
prejudice      -1.42
itself         -1.40
odb            -1.40
$(             -1.39
ically         -1.37
Positive Logits
®       1.72
¤       1.56
than    1.55
arma    1.54
¿½      1.47
Ļ       1.47
etica   1.47
ints    1.45
aping   1.45
clin    1.39
Activations Density 0.210%
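One plausible reading of the density figure is the fraction of sampled tokens on which the feature's activation is non-zero. The sketch below works under that assumption. The function name `activation_density`, the threshold parameter, and the toy activations are all hypothetical and do not come from the page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: density counts tokens with activation strictly above zero.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy, made-up activations over four tokens.
toy_activations = [0.0, 0.0, 1.3, 0.0]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

On this reading, a density of 0.210% would correspond to the feature firing on roughly 2 of every 1,000 sampled tokens.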