INDEX
Explanations
text related to paying tribute to individuals or events
Neuron Alignment
Index   Value   % of L₁
50      +0.21   0.8%
1618    +0.10   0.4%
994     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
947     +0.21      0.03
1618    +0.10      0.03
1953    +0.09      0.03
Negative Logits
Token              Value
<bos>              -2.37
ⓧ                  -0.81
<?                 -0.80
/***               -0.76
/**                -0.74
(blank)            -0.73
Πηγές              -0.71
قایناقلار          -0.71
/*                 -0.69
ValueGeneration    -0.68
Positive Logits
Token      Value
maneu      1.59
impra      1.46
increa     1.39
reluct     1.34
scrat      1.26
strick     1.25
disagre    1.24
inev       1.24
depic      1.23
unspeak    1.20
Activations Density 0.344%