INDEX
Explanations
questions and requests about specific topics or information
Neuron Alignment
Index   Value   % of L₁
50      +0.17   0.7%
394     +0.07   0.3%
86      +0.07   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1617    +0.17      0.05
1784    +0.07      0.06
1253    +0.07      0.03
Negative Logits
<bos>           -2.61
(blank)         -0.79
/**             -0.76
<?              -0.75
/*              -0.71
HasColumnType   -0.59
<?              -0.58
ⓧ               -0.57
.               -0.56
//---           -0.56
Positive Logits
Minang       1.34
maneu        1.31
bandung      1.29
sappi        1.27
pollut       1.25
ecru         1.24
unwarran     1.24
soggior      1.23
impractica   1.22
accla        1.21
Activations Density: 0.961%