INDEX
Explanations
negative contractions indicating inability or denial
Neuron Alignment
Index   Value   % of L₁
478     +0.16   0.8%
78      +0.12   0.6%
381     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
78      +0.16      0.07
478     +0.12      0.06
1491    +0.11      0.06
Negative Logits
Token          Logit
<bos>          -3.63
ⓧ              -1.30
<?             -1.19
/**            -1.12
(not shown)    -1.11
/***           -1.05
/*!            -0.86
superintend    -0.85
#![            -0.84
Lma            -0.84
Positive Logits
Token     Logit
seksi     0.72
;';       0.70
]='\      0.70
]$}       0.69
;;)       0.69
ssaint    0.68
$")       0.68
()")      0.67
/>";      0.67
/>";      0.67
Activations Density: 0.177%