INDEX
Explanations
verbs in the passive voice
Neuron Alignment
Index   Value   % of L₁
674     +0.38   1.5%
764     +0.19   0.7%
609     +0.17   0.7%
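The "% of L₁" column is not defined on this page. A minimal sketch of one plausible reading follows, assuming each neuron's share is the absolute value of its entry divided by the L₁ norm of the whole vector. The function name `l1_share`, the ranking by absolute value, and the sample vector are hypothetical and not taken from the dashboard.

```python
def l1_share(vector, top_k=3):
    """Return (index, value, share of L1 norm) for the top_k largest-magnitude entries.

    Assumption: "% of L1" means |value| / sum(|all entries|).
    """
    l1_norm = sum(abs(x) for x in vector)
    if l1_norm == 0:
        raise ValueError("vector has zero L1 norm")
    # Rank indices by absolute value, largest first, and keep the top_k.
    ranked = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)[:top_k]
    return [(i, vector[i], abs(vector[i]) / l1_norm) for i in ranked]


# Hypothetical 4-entry vector, not data from the dashboard.
rows = l1_share([0.5, -0.25, 0.25, 1.0], top_k=2)
for index, value, share in rows:
    print(f"{index}\t{value:+.2f}\t{share:.1%}")
```

Under this reading, a value of +0.38 at 1.5% would imply an L₁ norm of roughly 25, though both figures in the table are rounded.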
Correlated Neurons
Index   P. Corr.   Cos Sim.
599     +0.38      0.03
1       +0.19      0.02
1314    +0.17      0.02
Negative Logits
McLaugh        -1.02
impra          -0.96
hairc          -0.95
unspeak        -0.94
Bartholo       -0.92
Gorb           -0.91
ecru           -0.89
Considerable   -0.89
McInt          -0.88
snoopy         -0.87
Positive Logits
<bos>             1.21
ьаж               0.87
HttpNotFound      0.76
autorytatywna     0.75
-------------</   0.75
kasarigan         0.69
fromnode          0.68
RTGC              0.67
ویکیپدی           0.67
IsContent         0.66
Activations Density 0.114%