INDEX
Explanations
scientific research-related phrases
Neuron Alignment
Index   Value   % of L₁
50      +0.21   1.2%
699     +0.11   0.6%
805     +0.08   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
699     +0.21      0.04
1490    +0.11      0.03
569     +0.08      0.03
Negative Logits
<bos>         -3.08
/***          -0.96
<?            -0.90
/*!           -0.78
///**         -0.76
              -0.75
<!--          -0.64
InjectMocks   -0.64
//---         -0.64
ⓧ             -0.63
Positive Logits
lidl        1.28
stockholm   1.28
chrysler    1.28
bandung     1.27
Minang      1.25
maroc       1.23
perfon      1.23
wien        1.22
aen         1.22
lyon        1.18
Activations Density 0.117%