INDEX
Explanations
information related to public criticism or controversial actions
Neuron Alignment
Index   Value   % of L₁
478     +0.13   0.4%
144     +0.13   0.4%
776     +0.12   0.4%
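The "% of L₁" column can be read as each neuron's share of the L₁ norm of the feature's weight vector. The sketch below shows that calculation under this assumption. The function name `l1_share` and the toy vector are hypothetical and are not taken from this page or from any tool's API.

```python
def l1_share(weights, top_k=3):
    """Return (index, value, percent of L1 norm) for the top_k
    largest-magnitude entries of a weight vector.

    Assumption: "% of L1" means |w_i| / sum_j |w_j|, as a percentage.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []  # an all-zero vector has no meaningful shares
    # Rank indices by absolute weight, largest first.
    ranked = sorted(range(len(weights)),
                    key=lambda i: abs(weights[i]),
                    reverse=True)
    return [(i, weights[i], 100.0 * abs(weights[i]) / l1)
            for i in ranked[:top_k]]


# Toy vector for illustration only (not data from this page).
toy = [0.5, -0.25, 0.125, 0.125]
rows = l1_share(toy, top_k=2)
for index, value, pct in rows:
    print(f"{index}\t{value:+.2f}\t{pct:.1f}%")
```

On this reading, a value of +0.13 at 0.4% would imply a total L₁ norm of roughly 0.13 / 0.004 ≈ 32, meaning the weight is spread thinly across many neurons.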
Correlated Neurons
Index   P. Corr.   Cos Sim.
144     +0.13      0.05
1081    +0.13      0.06
1464    +0.12      0.05
Negative Logits
unspeak         -0.69
vexed           -0.52
indescri        -0.51
Bibliographie   -0.50
Warszawie       -0.49
viszont         -0.49
phénomènes      -0.49
impelled        -0.47
Normdatei       -0.47
Wirklichkeit    -0.46
Positive Logits
utop       1.06
incess     1.01
majest     0.94
sappi      0.93
priva      0.88
palet      0.88
ordina     0.88
monaster   0.85
beren      0.83
solidar    0.82
Activations Density 0.253%