INDEX
Explanations
exclamatory statements and onomatopoeic expressions
Neuron Alignment
Index   Value   % of L₁
240     +0.11   0.3%
1691    +0.10   0.3%
468     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
981     +0.11      0.03
240     +0.10      0.02
577     +0.10      0.02
Negative Logits
Kenmerken   -0.65
Indien      -0.57
Boven       -0.51
mide        -0.49
Flere       -0.49
Hva         -0.48
roda        -0.47
Verder      -0.47
المصادر     -0.47
Mú          -0.47
Positive Logits
unden     1.16
excru     1.15
accla     1.14
fatis     1.12
scrat     1.10
lidl      1.07
effe      1.06
volunte   1.06
perfet    1.05
unve      1.05
Activations Density: 0.071%