INDEX
Explanations
phrases or words related to representing or representation
Neuron Alignment
Index   Value   % of L₁
50      +0.18   1.0%
1491    +0.11   0.6%
521     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1101    +0.18      0.03
1096    +0.11      0.03
1491    +0.11      0.03
Negative Logits
<bos>         -3.00
(not shown)   -0.85
/***          -0.84
ⓧ             -0.82
<?            -0.73
/*!           -0.71
<?            -0.66
///**         -0.61
#![           -0.60
/*            -0.60
Positive Logits
Represent        1.15
represen         1.06
Representation   1.06
effe             1.02
representation   1.02
ftu              1.01
Representing     1.00
Represent        0.99
aen              0.99
fep              0.99
Activations Density: 0.128%