Explanations
phrases related to terms that are 'widely' known or performed
Neuron Alignment
Index   Value   % of L₁
50      +0.08   0.3%
1506    +0.06   0.2%
198     +0.06   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
198     +0.08      0.03
446     +0.06      0.03
271     +0.06      0.02
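The two columns of the Correlated Neurons table, Pearson correlation and cosine similarity, measure different things, which is why their values differ for the same neuron. The sketch below is illustrative only: the page does not state which vectors are compared, so the function names and sample vectors are assumptions, not the dashboard's actual computation.

```python
import math

def pearson(x, y):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den

def cosine(x, y):
    """Cosine similarity: normalized dot product, with no mean-centering."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den

# Hypothetical vectors: y is x shifted by a constant offset.
x = [1.0, 2.0, 3.0]
y = [11.0, 12.0, 13.0]
print(pearson(x, y), cosine(x, y))
```

Because Pearson correlation subtracts each vector's mean first, a constant offset leaves it unchanged, while cosine similarity is affected by it.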
Negative Logits
<bos>                -1.19
(token not shown)    -0.97
<?                   -0.96
/**                  -0.72
<?                   -0.69
ⓧ                    -0.64
/*                   -0.64
don                  -0.63
<!--                 -0.62
/***                 -0.61
Positive Logits
Widely       1.85
widely       1.85
stockholm    1.64
affor        1.59
accla        1.57
increa       1.53
nece         1.46
squa         1.45
madonna      1.43
fta          1.42
Activations Density 0.050%