INDEX
Explanations
requests or mentions made in a discreet or stealthy manner
Neuron Alignment
Index    Value    % of L₁
50       +0.16    0.6%
1437     +0.10    0.4%
1443     +0.07    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1443     +0.16       0.04
946      +0.10       0.05
1930     +0.07       0.03
Negative Logits
<bos>              -2.27
/*!                -0.67
/***               -0.65
/*++               -0.61
//{                -0.57
***************/   -0.56
displayquote       -0.55
//*/               -0.53
/**                -0.53
ఐ                  -0.53
Positive Logits
frankfurt    1.02
stockholm    1.01
squa         1.00
ricardo      0.99
stefan       0.98
Minang       0.98
seoul        0.96
arture       0.95
tramont      0.95
jorge        0.94
Activations Density: 0.545%