INDEX
Explanations
technical terms and proper nouns from a specific domain or source
Neuron Alignment
Index   Value   % of L₁
468     +0.12   0.4%
198     +0.10   0.3%
1553    +0.09   0.3%
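
The "% of L₁" column reads most naturally as the share of a vector's L₁ norm carried by a single entry. The sketch below shows that computation under this assumption; the function name percent_of_l1 and the example weights are hypothetical and do not come from the dashboard.

```python
def percent_of_l1(value, weights):
    """Return |value| as a percentage of the L1 norm of `weights`.

    Assumption: "% of L1" means |entry| / sum(|all entries|) * 100.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the percentage is undefined")
    return 100.0 * abs(value) / l1_norm


# Hypothetical three-entry vector, not taken from the dashboard.
example_weights = [0.5, -0.25, 0.25]
print(percent_of_l1(0.5, example_weights))
```

Under this reading, a value of +0.12 at 0.4% would imply a total L₁ norm of roughly 30, which is also consistent with the other two rows after rounding.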
Correlated Neurons
Index   P. Corr.   Cos Sim.
446     +0.12      0.04
198     +0.10      0.04
468     +0.09      0.04
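
"P. Corr." and "Cos Sim." are presumably Pearson correlation and cosine similarity. The dashboard does not say which vectors are compared, so the sketch below only illustrates the two standard formulas on plain Python lists; the function names and sample inputs are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance of the mean-centred inputs
    divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("an input is constant; correlation is undefined")
    return cov / (spread_x * spread_y)


def cosine(xs, ys):
    """Cosine similarity: dot product divided by the product of norms.
    Unlike Pearson, the inputs are not mean-centred first."""
    dot = sum(x * y for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum(x * x for x in xs))
    norm_y = math.sqrt(sum(y * y for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("an input is the zero vector; cosine is undefined")
    return dot / (norm_x * norm_y)


# Hypothetical inputs, not taken from the dashboard.
print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(cosine([1.0, 0.0], [0.0, 1.0]))
```

Because Pearson correlation centres its inputs and cosine similarity does not, the two columns need not agree even if they were computed on the same pair of vectors.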
Negative Logits
abstrait     -0.98
exécu        -0.88
alip         -0.86
régulier     -0.83
expéri       -0.81
imprimé      -0.78
fédéral      -0.77
privilégi    -0.77
typique      -0.77
convenable   -0.75
Positive Logits
sappi        0.88
useRouter    0.82
createdBy    0.81
parlando     0.77
nemia        0.76
März         0.74
thermomix    0.73
bbene        0.72
eronau       0.72
peculi       0.72
Activations Density 0.220%
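
Activation density is commonly taken to mean the fraction of tokens on which a feature fires at all. The sketch below assumes that definition (activation strictly above a threshold of zero); the function name activation_density and the sample activations are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" counts a token as active when its activation
    is strictly greater than the threshold (zero by default).
    """
    if not activations:
        raise ValueError("no activations supplied")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations over four tokens, not taken from the dashboard.
sample = [0.0, 0.0, 1.5, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this definition, a density of 0.220% would correspond to the feature being active on about 22 of every 10,000 tokens.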