INDEX
Explanations
phrases indicating a call to action or a sense of duty
Neuron Alignment
Index   Value   % of L₁
90      +0.09   0.3%
1978    +0.08   0.2%
1987    +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
90      +0.09      0.03
261     +0.08      0.01
1664    +0.07      0.02
Negative Logits
intrigu        -0.57
endeav         -0.52
endeavouring   -0.52
apprehen       -0.51
🕗             -0.50
Varan          -0.49
AMAZ           -0.49
impelled       -0.49
vainly         -0.48
ineffec        -0.48
Positive Logits
igneur     0.63
prostitu   0.63
pulsante   0.61
utop       0.58
notor      0.58
hunde      0.57
pantal     0.57
ideolog    0.56
matel      0.56
meras      0.55
Activations Density: 0.105%