Explanations
references to specific events or things in a detailed text, such as names, locations, and historical information
Neuron Alignment
Index  Value  % of L₁
313    +0.18  1.0%
528    +0.15  0.9%
1141   +0.15  0.8%
Correlated Neurons
Index  P. Corr.  Cos Sim.
1306   +0.18     0.04
1363   +0.15     0.03
313    +0.15     0.03
Negative Logits
<bos>        -1.25
Fordítás     -0.63
września     -0.62
Dlatego      -0.61
Mónica       -0.60
CreateIndex  -0.59
Bardzo       -0.59
Bárbara      -0.59
Darío        -0.59
Belén        -0.58
Positive Logits
Ca       1.07
casio    0.93
Ca       0.89
lara     0.89
Calcium  0.88
unce     0.87
roth     0.84
friable  0.84
jessica  0.84
jaya     0.84
Activations Density: 0.584%