INDEX
Explanations
technical and scientific terms in a research context
Neuron Alignment
Index   Value   % of L₁
872     +0.19   0.7%
599     +0.16   0.5%
50      +0.13   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1380    +0.19      0.04
872     +0.16      0.13
599     +0.13      0.10
Negative Logits
gesta           -0.90
Transfermarkt   -0.81
NKC             -0.77
notor           -0.76
cera            -0.76
interag         -0.75
Senat           -0.74
atle            -0.74
Distrikt        -0.74
bronz           -0.73
Positive Logits
shenan          1.13
disreg          1.11
wikihow         1.09
milf            1.08
lmfao           1.04
mcdonald        1.04
snoopy          1.01
Considerable    1.01
embodi          1.01
🤣🤣            1.00
Activations Density: 1.939%