Explanations
references to religious and moral discussions, possibly related to the Catholic Church
Neuron Alignment
Index    Value    % of L₁
509      +0.17    0.5%
1380     +0.12    0.4%
1725     +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
509      +0.17       0.09
1380     +0.12       0.02
1725     +0.09       0.02
Negative Logits
Lmfao      -0.68
Lma        -0.67
tricot     -0.66
nutella    -0.65
affez      -0.64
pylab      -0.63
giù        -0.62
prenota    -0.61
Hahah      -0.60
purée      -0.60
Positive Logits
Apare          0.43
Tradu          0.43
Cár            0.43
Puebla         0.42
QtWidgets      0.42
Igles          0.42
médicament     0.42
Branches       0.42
Haci           0.41
pú             0.41
Activations Density 0.911%