INDEX
Explanations
present tense verbs ending in 'live'
Neuron Alignment
Index   Value   % of L₁
50      +0.12   0.6%
1872    +0.06   0.3%
130     +0.06   0.3%
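
One plausible reading of the "% of L₁" column is each neuron's absolute value divided by the L₁ norm (the sum of absolute values) of the whole vector. The source does not confirm this, so treat it as an assumption. Under that reading the rows above are mutually consistent: 0.12 / 0.6% and 0.06 / 0.3% both imply an L₁ norm of roughly 20, up to rounding. A minimal sketch follows; the function name `l1_shares` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def l1_shares(weights):
    """Return (index, value, share of L1 norm) tuples, largest |value| first.

    Assumption: "% of L1" means |value| / sum(|all values|).
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        return []  # an all-zero vector has no meaningful shares
    rows = [(i, w, abs(w) / l1) for i, w in enumerate(weights)]
    return sorted(rows, key=lambda r: abs(r[1]), reverse=True)


# Toy vector with made-up numbers, for illustration only.
toy = [0.5, -1.5, 2.0, 1.0]
for index, value, share in l1_shares(toy)[:3]:
    print(f"{index:<7} {value:+.2f}   {share:.1%}")
```

The shares always sum to 1, so a top row of 0.6% would mean the vector's mass is spread thinly across many neurons rather than concentrated in a few.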
Correlated Neurons
Index   P. Corr.   Cos Sim.
130     +0.12      0.06
1872    +0.06      0.06
299     +0.06      0.05
Negative Logits
Token           Logit
<bos>           -1.80
/***            -0.77
                -0.77
/**             -0.74
ⓧ               -0.72
<?              -0.72
/*              -0.71
else            -0.67
hline           -0.66
HasColumnType   -0.66
Positive Logits
Token     Logit
affor     2.17
maneu     2.14
impra     2.11
increa    2.00
accla     1.90
strick    1.85
emphat    1.85
inev      1.85
reluct    1.81
disagre   1.79
Activations Density 0.098%