INDEX
Explanations
phrases related to a specific product or brand
Neuron Alignment
Index   Value   % of L₁
1328    +0.14   0.8%
1828    +0.12   0.7%
228     +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1328    +0.14      0.03
1331    +0.12      0.02
976     +0.12      0.02
Negative Logits
<bos>         -2.77
InputBorder   -0.79
app           -0.73
sidemargin    -0.70
lateinit      -0.70
ỡng           -0.68
mergeFrom     -0.67
append        -0.67
inform        -0.67
let           -0.66
Positive Logits
indestru      1.76
sergio        1.75
ricardo       1.70
stockholm     1.70
depic         1.69
emphat        1.67
affor         1.67
javier        1.65
excru         1.64
reluct        1.63
Activations Density 0.129%
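
As a rough illustration of the density figure, the sketch below assumes that density means the fraction of tokens on which the feature's activation is above zero. That definition is an assumption, not something the dashboard states, and the function name and sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of entries strictly above the threshold (assumed definition)."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations, not data from the dashboard.
sample = [0.0, 0.0, 1.5, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.129% would correspond to the feature firing on roughly 1.3 tokens per thousand.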