INDEX
Explanations
text referring to the depiction or description of technological forms
Neuron Alignment
Index   Value   % of L₁
1343    +0.12   0.4%
699     +0.09   0.3%
11      +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
779     +0.12      0.02
1213    +0.09      0.02
270     +0.08      0.02
Negative Logits
ché       -0.99
Chá       -0.95
ù         -0.94
libere    -0.91
parlar    -0.90
sopr      -0.89
dì        -0.87
siff      -0.87
ì         -0.85
kön       -0.85
Positive Logits
unspeak   0.96
indescri  0.84
shenan    0.76
apprehen  0.76
horrend   0.74
ineffec   0.69
overcrow  0.63
unimagin  0.63
unbear    0.62
довлет    0.61
Activations Density 0.095%