INDEX
Explanations
web content-related elements and interaction options, such as comments, replies, and engagement prompts
Neuron Alignment
Index   Value   % of L₁
50      +0.23   1.1%
2019    +0.14   0.6%
1343    +0.07   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
2019    +0.23      0.15
924     +0.14      0.15
1445    +0.07      0.14
Negative Logits
<bos>        -3.28
ⓧ            -1.13
/**          -0.97
<?           -0.92
             -0.89
/***         -0.81
ratify       -0.70
springfox    -0.70
/*           -0.68
shivered     -0.67
Positive Logits
véhic        0.98
pleins       0.86
milano       0.86
marseille    0.83
bandung      0.83
maroc        0.82
expériment   0.81
Luglio       0.79
multicolore  0.78
soulign      0.78
Activations Density: 1.827%