INDEX
Explanations
racial and ethnic terms in a discussion context
Neuron Alignment
Index   Value   % of L₁
1870    +0.15   0.5%
521     +0.13   0.4%
1101    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
521     +0.15      0.03
1101    +0.13      0.02
1520    +0.11      0.02
Negative Logits
Nuorodos       -0.60
ApiProperty    -0.55
Lugares        -0.49
itinéraires    -0.47
Referencies    -0.47
Història       -0.47
})+\           -0.46
Bauch          -0.46
Unmount        -0.46
JoinTable      -0.46
Positive Logits
Augu       1.15
racial     1.11
Racial     1.06
Mlle       0.96
Telex      0.96
fta        0.95
cartier    0.94
Motos      0.94
ftu        0.94
paff       0.93
Activations Density: 0.041%