INDEX
Explanations
phrases related to comparisons and movements between different locations or groups
Neuron Alignment
Index    Value    % of L₁
1385     +0.17    0.5%
198      +0.11    0.3%
1978     +0.09    0.3%
Correlated Neurons
Index    Pearson Corr.    Cos Sim.
198      +0.17            0.05
1861     +0.11            0.05
1984     +0.09            0.05
Negative Logits
unspeak     -0.76
vainly      -0.74
ineffec     -0.73
shenan      -0.73
apprehen    -0.73
roused      -0.69
gaily       -0.68
jerked      -0.66
grumbled    -0.66
impra       -0.65
Positive Logits
destina       0.81
Wikisource    0.79
broder        0.79
branche       0.78
policia       0.76
fras          0.74
polig         0.73
ló            0.73
veda          0.71
vernac        0.71
Activations Density 0.319%