INDEX
Explanations
facts or events related to specific occurrences, projects or influential individuals
Neuron Alignment
Index    Value    % of L₁
596      +0.09    0.3%
976      +0.09    0.3%
1140     +0.09    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
976      +0.09       0.04
1490     +0.09       0.03
1140     +0.09       0.03
Negative Logits
apprehen    -1.06
disagre     -1.01
maksi       -0.96
unspeak     -0.95
reluct      -0.91
intrigu     -0.86
vainly      -0.86
inappro     -0.85
uninten     -0.84
attemp      -0.83
Positive Logits
so          0.83
so          0.75
wieś        0.72
SO          0.68
constate    0.66
@[+][       0.65
astéro      0.63
loài        0.63
Viitteet    0.61
So          0.61
Activations Density 0.075%