INDEX
Explanations
descriptive details and actions related to a narrative
Neuron Alignment
Index   Value   % of L₁
1233    +0.10   0.3%
486     +0.10   0.3%
1351    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
254     +0.10      0.02
1032    +0.10      0.02
1619    +0.10      0.02
Negative Logits
Πηγές             -0.66
foque             -0.65
Τι                -0.65
İstinadlar        -0.64
Sqft              -0.64
ModelExpression   -0.63
Filmo             -0.62
kanton            -0.62
ftagPool          -0.61
Walkover          -0.61
Positive Logits
reluct       1.54
encomp       1.49
disagre      1.49
maneu        1.48
increa       1.42
guarante     1.42
inev         1.40
shenan       1.38
volunte      1.37
intersper    1.36
Activations Density 0.047%