INDEX
Explanations
prepositional phrases starting with 'of a'
Neuron Alignment
Index   Value   % of L₁
674     +0.27   0.9%
538     +0.10   0.4%
1964    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1516    +0.27      0.03
1416    +0.10      0.03
538     +0.10      0.03
Negative Logits
Bettina   -0.74
Illus     -0.71
EEU       -0.69
berea     -0.66
miu       -0.66
Tanja     -0.66
Edy       -0.66
cuban     -0.63
Ines      -0.62
shenan    -0.62
Positive Logits
<bos>     1.11
kapag     0.63
ształ     0.63
kasama    0.59
itong     0.58
kabel     0.58
siyang    0.56
habang    0.56
meil      0.55
kahit     0.55
Activations Density: 0.152%