Explanations
action words related to data analysis and monitoring
Neuron Alignment
Index   Value   % of L₁
1870    +0.14   0.5%
468     +0.13   0.4%
678     +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
100     +0.14      0.06
1379    +0.13      0.06
678     +0.11      0.07
Negative Logits
perciò            -0.63
InSeconds         -0.60
pertanto          -0.58
raccont           -0.57
Caratteristiche   -0.56
<bos>             -0.55
Література        -0.52
Vedi              -0.51
riguardo          -0.51
nemmeno           -0.50
Positive Logits
bandung   1.00
bahay     0.85
susun     0.80
Minang    0.76
jaya      0.75
tanong    0.74
jawa      0.74
labd      0.72
teras     0.72
silang    0.72
Activations Density 0.592%