INDEX
Explanations
phrases related to making a positive impact or difference in the community
Neuron Alignment
Index   Value   % of L₁
1013    +0.11   0.3%
764     +0.09   0.3%
1253    +0.09   0.2%
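As a rough illustration of the "% of L₁" column: one plausible reading, which is an assumption and not confirmed by this page, is that it reports a single neuron's absolute weight as a share of the L1 norm of the whole weight vector. The sketch below uses a hypothetical helper name, l1_share, and made-up weights. It does not reproduce the values shown here.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means this ratio; the page does not define it.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; share is undefined")
    return abs(weights[index]) / l1_norm


# Hypothetical weights, not taken from the dashboard.
example_weights = [0.5, -0.25, 0.25]
share = l1_share(example_weights, 1)
print(f"{share:.1%}")  # prints 25.0%
```

Under this reading, a value of +0.11 at 0.3% would imply a total L1 norm in the tens, spread across many neurons with small weights.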
Correlated Neurons
Index   P. Corr.   Cos Sim.
2016    +0.11      0.05
780     +0.09      0.02
458     +0.09      0.05
Negative Logits
Token         Logit
?...          -0.88
quoique       -0.86
fta           -0.84
thut          -0.83
certes        -0.81
!...          -0.80
quelquefois   -0.80
encomp        -0.79
guarante      -0.79
ftu           -0.75
Positive Logits
Token           Logit
lives           0.69
society         0.64
AssemblyTitle   0.59
communities     0.56
world           0.55
people          0.54
AndEndTag       0.54
mankind         0.53
change          0.51
humanity        0.51
Activations Density 0.431%