INDEX
Explanations
proper nouns related to languages and tech companies
Neuron Alignment
Index   Value   % of L₁
1741    +0.15   0.5%
1919    +0.10   0.3%
599     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1919    +0.15      0.07
1510    +0.10      0.05
227     +0.10      0.07
Negative Logits
VIDEOTAPE         -0.72
mitte             -0.66
الرياضيه          -0.63
inimes            -0.63
SharedDtor        -0.63
hial              -0.61
autorytatywna     -0.59
verifyException   -0.58
desertcart        -0.58
Drapeau           -0.58
Positive Logits
snoopy        1.14
jurassic      1.04
madonna       1.03
scrat         1.02
kraken        1.01
strick        1.00
ferrari       0.99
impra         0.99
lamborghini   0.98
wikihow       0.98
Activations Density: 0.533%