INDEX
Explanations
words related to accidents, injuries, and emergencies
Neuron Alignment
Index   Value   % of L₁
1216    +0.07   0.2%
674     +0.07   0.2%
630     +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
946     +0.07      0.04
1493    +0.07      0.02
736     +0.07      0.04
Negative Logits
unie             -0.55
jws              -0.53
webElementGuid   -0.51
rente            -0.50
Dokter           -0.49
petito           -0.49
FFIX             -0.48
spis             -0.47
CopyWith         -0.47
corrom           -0.47
Positive Logits
newArr    0.86
jorge     0.81
ftu       0.80
pymongo   0.78
sergio    0.75
yves      0.73
pymysql   0.70
Lmao      0.70
pylab     0.69
roberto   0.69
Activations Density: 0.186%