INDEX
Explanations
phrases related to preservation and conservation
Neuron Alignment
Index    Value    % of L₁
411      +0.15    0.5%
528      +0.13    0.5%
596      +0.13    0.5%
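The "% of L₁" column is not defined on this page. One plausible reading, stated here as an assumption, is that it gives each neuron's absolute weight as a share of the L₁ norm (the sum of absolute values) of the feature's direction. Under that reading, a value of +0.15 at 0.5% would imply an L₁ norm of roughly 30, though the rounded percentages make this only approximate. The sketch below uses a made-up toy vector and a hypothetical helper name, `l1_share`; neither comes from the source.

```python
def l1_share(weights):
    """Return each weight's share of the vector's L1 norm, in percent.

    Assumption: "% of L1" means 100 * |w_i| / sum_j |w_j|.
    The page does not confirm this definition.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [100.0 * abs(w) / l1_norm for w in weights]


# Toy direction (illustrative values only, not taken from the dashboard).
toy_direction = [0.5, -0.25, 0.25]
shares = l1_share(toy_direction)
```

The sign of a weight does not affect its share, since only absolute values enter the norm.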
Correlated Neurons
Index    P. Corr.    Cos Sim.
411      +0.15       0.02
1174     +0.13       0.02
596      +0.13       0.02
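The two columns above report different similarity measures, presumably Pearson correlation ("P. Corr.") and cosine similarity ("Cos Sim."). The page does not say which vectors are being compared, so the following is only a minimal sketch of the two measures themselves on made-up data. It uses the standard identity that Pearson correlation equals the cosine similarity of the mean-centered vectors, which is why the two numbers can differ for the same pair.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Toy vectors (illustrative only, not taken from the dashboard).
u = [1.0, 2.0, 3.0]
v = [3.0, 2.0, 1.0]
cos_uv = cosine_similarity(u, v)        # positive: both vectors are all-positive
pearson_uv = pearson_correlation(u, v)  # negative: v falls as u rises
```

In this toy case the cosine similarity is positive while the Pearson correlation is negative, which illustrates that the two columns need not agree in magnitude or even in sign.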
Negative Logits
Token         Value
Middles       -0.63
impractica    -0.63
Workmen       -0.63
timberland    -0.60
mortgagee     -0.58
trouvera      -0.58
negroes       -0.57
viendra       -0.56
timately      -0.56
unlaw         -0.54
Positive Logits
Token           Value
preserve        1.22
preservation    1.18
preserve        1.08
preservation    1.08
preserving      1.07
preserves       1.07
preserved       1.06
Preservation    1.01
preserved       1.01
Preserve        1.00
Activations Density 0.066%