INDEX
Explanations
Elements of HTML/CSS code related to text formatting
Neuron Alignment
Index  Value  % of L₁
1343   +0.18  0.5%
876    +0.14  0.4%
1385   +0.10  0.3%
Correlated Neurons
Index  P. Corr.  Cos Sim.
1343   +0.18     0.02
876    +0.14     0.00
1912   +0.10     0.01
Negative Logits
endeavouring  -0.83
unspeak       -0.81
intrigu       -0.81
exasper       -0.78
apprehen      -0.78
impelled      -0.76
obstinate     -0.75
vexed         -0.74
labouring     -0.73
endeav        -0.72
Positive Logits
affez     1.24
merav     0.97
soggior   0.95
Décembre  0.95
fatte     0.94
ristor    0.92
preghi    0.90
rimb      0.90
rilass    0.90
frasi     0.90
Activations Density 0.047%