INDEX
Explanations
words related to sensational or controversial topics
Neuron Alignment
Index   Value   % of L₁
674     +0.10   0.3%
468     +0.08   0.2%
1363    +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1363    +0.10      0.07
1852    +0.08      0.05
850     +0.08      0.05
Negative Logits
unspeak        -1.16
vainly         -1.16
impelled       -1.13
endeavouring   -1.12
wilfully       -1.09
thoughtless    -1.04
ineffec        -1.02
laboring       -0.97
unceasing      -0.94
ineffectual    -0.94
Positive Logits
robus           1.14
sappi           1.12
ristor          1.08
pernic          1.07
saluto          1.06
distanciation   1.05
affatto         1.04
pecuni          1.04
solidar         1.04
tramont         1.03
Activations Density: 0.238%