Explanations
the word "useful" in various contexts
Neuron Alignment
Index   Value   % of L₁
50      +0.17   0.9%
47      +0.13   0.6%
168     +0.12   0.6%
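The "% of L₁" column appears to express each neuron's absolute value as a share of the whole vector's L₁ norm (the sum of absolute values). That reading is an assumption, not something the page states, although the rows are roughly consistent with it: 0.17 at 0.9% would imply an L₁ norm on the order of 20. A minimal Python sketch under that assumption, using a made-up vector `toy` rather than this feature's real weights, and hypothetical helper names `l1_share` and `top_aligned`:

```python
def l1_share(weights, index):
    """Return |weights[index]| as a percentage of the vector's L1 norm."""
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("a zero vector has no L1 norm to take a share of")
    return 100.0 * abs(weights[index]) / l1_norm


def top_aligned(weights, k=3):
    """Return the k rows (index, value, percent_of_l1) with the largest |value|."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], l1_share(weights, i)) for i in order[:k]]


# Hypothetical toy vector, chosen only so the arithmetic is easy to follow.
toy = [0.5, -1.5, 2.0, 0.0]

for index, value, percent in top_aligned(toy, k=2):
    print(f"{index}\t{value:+.2f}\t{percent:.1f}%")
```

With thousands of neurons sharing the norm, even the top-ranked entries can hold under 1% each, which is what the table shows.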
Correlated Neurons
Index   P. Corr.   Cos Sim.
47      +0.17      0.03
168     +0.13      0.04
1271    +0.12      0.03
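Assuming "P. Corr." stands for Pearson correlation and "Cos Sim." for cosine similarity, the two columns measure related but different things: Pearson correlation is the cosine similarity of the two vectors after each has been mean-centered. The page does not say which vectors are compared, so the sketch below works on generic toy lists; the function names are illustrative only.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_similarity([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical toy data: y is x shifted by a constant offset.
x = [1.0, 2.0, 3.0]
y = [11.0, 12.0, 13.0]

print(pearson_correlation(x, y))  # the offset is removed by centering
print(cosine_similarity(x, y))    # the offset changes the angle
```

Because centering removes any constant offset, the two measures can disagree noticeably on the same pair of vectors, so differing values in the two columns are not by themselves a sign of an error.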
Negative Logits
Token     Logit
<bos>     -3.12
/***      -0.81
          -0.81
ⓧ         -0.80
/**       -0.79
mount     -0.65
},{       -0.64
/*        -0.63
Более     -0.63
})();     -0.61
Positive Logits
Token       Logit
ankara      1.47
milano      1.46
maroc       1.41
italia      1.40
wien        1.39
fers        1.37
eiffel      1.36
ibiza       1.35
stockholm   1.35
riviera     1.34
Activations Density 0.201%
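Activation density is commonly read as the fraction of sampled tokens on which the feature fires at all; a value of 0.201% would then mean roughly 2 active tokens per 1,000. That definition, the zero threshold, and the toy data below are assumptions for illustration, not details taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Percentage of entries whose activation exceeds the threshold."""
    if not activations:
        raise ValueError("need at least one activation to compute a density")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 1,000 token positions, 2 of them active.
sample = [0.0] * 998 + [1.3, 0.4]

print(f"{activation_density(sample):.3f}%")
```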