Explanations
HTML characters and website links
Neuron Alignment
Index   Value   % of L₁
1343    +0.32   1.0%
1967    +0.17   0.6%
924     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1343    +0.32      0.03
1383    +0.17      0.02
924     +0.10      0.02
Negative Logits
यास        -0.56
vodi       -0.53
hésite     -0.49
einf       -0.48
ждествен   -0.47
wird       -0.47
رضا        -0.47
four       -0.47
wurde      -0.46
scienced   -0.46
Positive Logits
ciment      0.92
grati       0.92
petto       0.88
ciao        0.87
aquarelle   0.87
marte       0.85
fluo        0.84
toscana     0.83
nutr        0.83
!!</        0.81
Activations Density: 0.048%