INDEX
Explanations
quotes and statements attributed to individuals
Neuron Alignment
Index    Value    % of L₁
1445     +0.12    0.4%
184      +0.11    0.3%
381      +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
32       +0.12       0.04
1053     +0.11       0.04
1068     +0.10       0.04
Negative Logits
saurait     -0.79
sentito     -0.73
lapto       -0.72
nomme       -0.68
psychiat    -0.65
agenti      -0.63
anse        -0.63
reger       -0.61
handels     -0.61
répondit    -0.61
Positive Logits
unspeak     0.83
mischie     0.73
shenan      0.69
apprehen    0.68
withal      0.67
reconno     0.67
gaily       0.66
pooh        0.65
flattery    0.64
idleness    0.64
Activations Density: 0.126%