INDEX
Explanations
phrases related to socio-economic impacts and effects
Neuron Alignment
Index   Value   % of L₁
1       +0.08   0.2%
90      +0.07   0.2%
1804    +0.07   0.2%
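The "% of L₁" column is not defined on this page. One plausible reading, offered here only as an assumption, is that it reports each weight's absolute value as a share of the L₁ norm (the sum of absolute values) of the whole weight vector. A minimal sketch of that calculation follows; the function name `l1_share` and the toy vector are hypothetical and are not taken from the feature's actual weights.

```python
def l1_share(weights):
    """Return each weight's absolute value as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means |w_i| / sum_j |w_j|. The page does not confirm this.
    """
    l1 = sum(abs(w) for w in weights)
    if l1 == 0:
        # An all-zero vector has no meaningful shares; return zeros.
        return [0.0 for _ in weights]
    return [abs(w) / l1 for w in weights]


# Hypothetical toy vector, chosen so the arithmetic is easy to check by hand.
toy = [0.5, -0.25, 0.25]
shares = l1_share(toy)
print([f"{s:.1%}" for s in shares])
```

Under this reading, a value of +0.08 at 0.2% would imply a total L₁ norm of roughly 40, which is at least consistent with a weight spread thinly over many neurons.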
Correlated Neurons
Index   P. Corr.   Cos Sim.
871     +0.08      0.03
32      +0.07      0.03
1392    +0.07      0.03
Negative Logits
wattpad     -0.73
disreg      -0.72
bonjour     -0.69
emphat      -0.69
endom       -0.68
inappro     -0.68
compen      -0.67
héro        -0.66
intrigu     -0.66
indestru    -0.65
Positive Logits
impact          0.86
effect          0.77
impacts         0.71
effects         0.70
impact          0.69
affect          0.68
implications    0.67
effect          0.66
consequences    0.62
Impact          0.61
Activations Density 0.173%