INDEX
Explanations
references to the city of Austin
Neuron Alignment
Index    Value    % of L₁
50       +0.15    0.9%
241      +0.12    0.7%
1350     +0.11    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
204      +0.15       0.02
1335     +0.12       0.02
241      +0.11       0.02
Negative Logits
<bos>           -2.14
ⓧ               -0.77
více            -0.74
lepší           -0.74
(blank)         -0.68
/**             -0.67
referenties     -0.66
ViewFeatures    -0.65
հղումներ        -0.63
Při             -0.60
Positive Logits
Austin    1.45
Austin    1.36
AUSTIN    1.32
USTIN     1.26
austin    1.25
marte     0.98
Simult    0.95
Mâ        0.95
austin    0.95
Bén       0.95
Activations Density 0.049%