INDEX
Explanations
mentions of depicting or representing
Neuron Alignment
Index    Value    % of L₁
1334     +0.08    0.3%
1101     +0.08    0.3%
200      +0.08    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
647      +0.08       0.02
1141     +0.08       0.02
1101     +0.08       0.02
Negative Logits
Token          Logit
<bos>          -0.85
ുറ             -0.67
public         -0.65
multicolumn    -0.62
interface      -0.58
else           -0.58
unsigned       -0.57
int            -0.57
if             -0.57
struct         -0.57
Positive Logits
Token          Logit
portray        1.95
depic          1.95
portrayal      1.89
depiction      1.81
portraying     1.80
depict         1.75
portrayed      1.75
portrays       1.68
depictions     1.66
depicting      1.64
Activations Density: 0.092%