INDEX
Explanations
information related to discussions, analyses, and comparisons between different topics or subjects
Neuron Alignment
Index   Value   % of L₁
872     +0.11   0.3%
678     +0.10   0.3%
674     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1450    +0.11      0.03
678     +0.10      0.04
1415    +0.09      0.02
Negative Logits
madonna      -0.68
curé         -0.57
lovel        -0.52
beaute       -0.51
javier       -0.50
ardour       -0.50
chivalry     -0.49
accla        -0.48
parlement    -0.48
?...         -0.48
Positive Logits
ourselves        0.70
GEBURTSDATUM     0.67
tayo             0.62
OGND             0.58
pylab            0.57
centralwidget    0.54
tanong           0.53
natin            0.53
onOptions        0.50
poważ            0.50
Activations Density: 0.301%
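One way to read the density figure is as the share of token positions on which the feature fires. The sketch below assumes that definition (the fraction of positions with an activation above zero); the dashboard's exact definition is not stated here. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and for illustration only.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active, i.e. strictly above the threshold.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 3 of which activate.
sample = [0.0] * 997 + [0.8, 1.2, 0.4]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.301% would mean the feature is active on roughly 3 of every 1000 token positions.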