INDEX
Explanations
questions starting with "Why"
Neuron Alignment
Index    Value    % of L₁
50       +0.20    1.2%
1276     +0.11    0.6%
122      +0.10    0.6%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1276     +0.20       0.06
397      +0.11       0.05
757      +0.10       0.04
Negative Logits
<bos>           -3.06
ⓧ               -0.93
/***            -0.82
(blank)         -0.81
/*!             -0.73
rehabilitate    -0.67
/**             -0.66
<?              -0.65
<?              -0.61
lateinit        -0.60
Positive Logits
bandung         1.12
eiffel          1.09
Manufact        1.08
why             1.07
WHY             1.06
why             1.05
swarovski       1.05
milano          1.02
WHY             1.01
beverly         1.01
Activations Density 0.120%