INDEX
Explanations
descriptive language related to products and services
Neuron Alignment
Index   Value   % of L₁
50      +0.14   0.5%
82      +0.10   0.4%
1013    +0.10   0.4%
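The "% of L₁" column is not defined on the page. One plausible reading, assumed here, is that it gives each neuron's absolute value as a share of the L1 norm of the whole vector. The sketch below illustrates that reading only; the function names `l1_share` and `top_aligned` are hypothetical and not part of any dashboard API.

```python
def l1_share(vector):
    """Return each entry's absolute value as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means |v_i| / sum_j |v_j|. The page does not confirm this.
    """
    l1 = sum(abs(v) for v in vector)
    if l1 == 0:
        raise ValueError("L1 norm is zero; shares are undefined")
    return [abs(v) / l1 for v in vector]


def top_aligned(vector, k=3):
    """Return the k entries with the largest magnitude as (index, value, share)."""
    shares = l1_share(vector)
    ranked = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    return [(i, vector[i], shares[i]) for i in ranked[:k]]


# Toy vector, not data from the page.
example = [0.5, -1.5, 2.0]
print(top_aligned(example, k=2))
```

Under this reading, a value of +0.14 at 0.5% would imply an L1 norm of roughly 28, which is also consistent with +0.10 rounding to 0.4%.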
Correlated Neurons
Index   P. Corr.   Cos Sim.
1013    +0.14      0.07
1385    +0.10      0.07
1512    +0.10      0.05
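The two columns above measure different things. Assuming "P. Corr." is the Pearson correlation between two activation series and "Cos Sim." is the cosine similarity between two vectors (the page does not say which vectors are compared), a minimal sketch of both quantities looks like this. The function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered series."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a], [y - mean_b for y in b])


# Toy series, not data from the page.
feature_acts = [0.0, 1.0, 2.0, 3.0]
neuron_acts = [1.0, 3.0, 5.0, 7.0]
print(pearson_correlation(feature_acts, neuron_acts))
print(cosine_similarity(feature_acts, neuron_acts))
```

Pearson correlation subtracts the mean before comparing, so two series can be perfectly correlated while their raw cosine similarity is below 1, as in the toy example.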
Negative Logits
<bos>              -2.34
/***               -0.90
ⓧ                  -0.78
Vegeu              -0.65
/**                -0.65
<?                 -0.64
(no token shown)   -0.62
MockBean           -0.58
/*                 -0.56
Тру                -0.55
Positive Logits
jaya        1.00
daz         0.94
benzina     0.93
bloss       0.91
bayern      0.90
saad        0.90
meis        0.90
frankfurt   0.90
kyo         0.90
hina        0.89
Activations Density 0.643%
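Activation density is commonly read as the fraction of sampled token positions on which the feature fires. The sketch below assumes that definition and a firing threshold of zero; neither is stated on the page, and `activation_density` is a hypothetical helper.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of positions whose activation exceeds the threshold.

    Assumption: a position counts as active when its activation is > threshold.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy sample, not data from the page: 2 of 8 positions are active.
sample = [0.0, 0.0, 0.5, 0.0, 1.2, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3%}")
```

For scale, 643 active positions out of 100,000 would give a density of 0.643%; the actual sample size behind the figure above is not given.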