INDEX
Explanations
software-related terms
Neuron Alignment
Index   Value   % of L₁
1870    +0.17   0.7%
1581    +0.15   0.6%
1705    +0.12   0.5%
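The page does not define the "% of L₁" column. One plausible reading, stated here as an assumption, is the absolute value of a neuron's entry divided by the L₁ norm of the whole feature direction. The listed rows are at least consistent with that reading: 0.17 / 0.007, 0.15 / 0.006 and 0.12 / 0.005 all give an L₁ norm of roughly 24 to 25. A minimal sketch under that assumption, using a hypothetical helper `top_aligned_neurons` and a toy vector rather than the real feature direction:

```python
def top_aligned_neurons(direction, k=3):
    """Return (index, value, share_of_l1) for the k largest-magnitude entries.

    Assumes "% of L1" means |value| / sum(|direction|); the dashboard
    itself does not state this definition.
    """
    l1 = sum(abs(v) for v in direction)
    if l1 == 0:
        return []  # an all-zero direction has no meaningful shares
    ranked = sorted(enumerate(direction), key=lambda iv: abs(iv[1]), reverse=True)
    return [(i, v, abs(v) / l1) for i, v in ranked[:k]]


# Toy direction (illustrative values only, not taken from the page)
toy_direction = [0.125, -0.5, 0.25, 0.125]
rows = top_aligned_neurons(toy_direction, k=2)

for index, value, share in rows:
    print(f"{index}\t{value:+.2f}\t{share:.1%}")
```

Ranking by absolute value keeps strongly negative entries in the list; whether the dashboard ranks by signed or absolute value is not stated.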
Correlated Neurons
Index   P. Corr.   Cos Sim.
1581    +0.17      0.02
1705    +0.15      0.03
489     +0.12      0.02
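"P. Corr." presumably stands for Pearson correlation and "Cos Sim." for cosine similarity; the page does not say which vectors each one is computed over, so the following is only a sketch of the two measures themselves. It shows why the two columns need not agree: Pearson correlation subtracts the mean of each vector first, while cosine similarity does not. The function names and toy inputs are illustrative assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: cosine similarity of the mean-centered vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cosine(xs, ys):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum(x * x for x in xs))
    norm_y = math.sqrt(sum(y * y for y in ys))
    return dot / (norm_x * norm_y)


# Toy vectors: perfectly linearly related, but with different means
a = [1.0, 2.0, 3.0]
b = [-1.0, 0.0, 1.0]
p = pearson(a, b)
c = cosine(a, b)
print(f"pearson={p:.3f} cosine={c:.3f}")
```

In this toy case the Pearson correlation is 1 while the cosine similarity is well below 1, because only the former ignores the offset between the two vectors.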
Negative Logits
Token     Logit
accla     -0.67
idolat    -0.67
racon     -0.63
milf      -0.62
nicolas   -0.59
fash      -0.59
hugo      -0.59
disgra    -0.58
fann      -0.57
pamph     -0.57
Positive Logits
Token       Logit
software    1.37
Software    1.22
Software    1.22
software    1.21
SOFTWARE    1.11
SOFTWARE    0.96
软件        0.95
softwares   0.82
oftware     0.77
logiciel    0.72
Activations Density 0.064%