Explanations
conditional statements beginning with "if" and "even if"
Neuron Alignment
Index    Value    % of L₁
50       +0.25    1.1%
890      +0.08    0.4%
1370     +0.08    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1652     +0.25       0.03
1370     +0.08       0.04
1643     +0.08       0.04
Negative Logits
Token                   Logit
<bos>                   -2.39
/*                      -0.78
<?                      -0.76
/***                    -0.76
(token not rendered)    -0.69
ⓧ                       -0.62
/**                     -0.61
AssemblyCompany         -0.61
public                  -0.59
///**                   -0.57
Positive Logits
Token       Logit
wien        1.41
ftu         1.39
fup         1.32
perfon      1.31
fta         1.31
thut        1.30
fays        1.27
aen         1.27
maneu       1.27
Manufact    1.26
Activations Density 0.213%