INDEX
Explanations
legal case references in specific citation formats
Neuron Alignment
Index    Value    % of L₁
50       +0.20    0.7%
1343     +0.11    0.4%
453      +0.06    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
485      +0.20       0.03
1714     +0.11       0.03
286      +0.06       0.02
Negative Logits
Token             Logit
<bos>             -2.14
ⓧ                 -1.00
(not rendered)    -0.97
/*                -0.91
/**               -0.89
<?                -0.89
/***              -0.82
quitted           -0.82
<?                -0.78
inaugurate        -0.63
Positive Logits
Token      Logit
corrom     0.72
bandung    0.70
Trasp      0.67
catég      0.67
marea      0.64
cuit       0.64
ados       0.64
riva       0.64
vano       0.62
mezza      0.61
Activations Density: 0.028%