Explanations
words related to wildlife conservation, especially focusing on rhinos and elephants
Neuron Alignment
Index    Value    % of L₁
1220     +0.13    0.4%
2016     +0.11    0.4%
228      +0.11    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1220     +0.13       0.06
802      +0.11       0.05
1597     +0.11       0.03
Negative Logits
hek            -1.16
kac            -1.07
minimalis      -1.00
adal           -0.99
buk            -0.91
straff         -0.91
katal          -0.89
broder         -0.88
Fichier        -0.88
makro          -0.88
Positive Logits
unspeak        0.98
philosophic    0.92
liberality     0.92
ingrat         0.92
earnestness    0.88
indestru       0.86
withal         0.84
veneration     0.81
despotism      0.79
obstin         0.79
Activations Density: 0.253%