INDEX
Explanations
the possessive form of words
Neuron Alignment
Index   Value   % of L₁
204     +0.14   0.5%
1705    +0.12   0.4%
1839    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
920     +0.14      0.04
1705    +0.12      0.06
204     +0.11      0.04
Negative Logits
solidar    -1.03
akut       -0.93
ideolog    -0.92
textil     -0.83
optik      -0.82
kriminal   -0.82
ortop      -0.80
geolog     -0.79
radikal    -0.79
kosme      -0.78
Positive Logits
shewn       0.83
hairc       0.82
gratify     0.79
gaily       0.74
disagre     0.74
quitted     0.73
tolerably   0.73
unspeak     0.72
intersper   0.71
vainly      0.69
Activations Density 0.285%