Explanations
comparisons using the word "Like"
Neuron Alignment
Index    Value    % of L₁
554      +0.14    0.4%
1491     +0.12    0.4%
1828     +0.11    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
554      +0.14       0.05
1491     +0.12       0.04
892      +0.11       0.05
Negative Logits
teborg      -0.57
protokol    -0.52
balkon      -0.52
alkoh       -0.51
manuten     -0.51
keramik     -0.50
pó          -0.49
asfal       -0.48
jgl         -0.48
szt         -0.48
Positive Logits
intersper   0.83
seperti     0.80
like        0.80
Like        0.80
ike         0.78
Like        0.78
like        0.76
LIKE        0.76
LIKE        0.73
shenan      0.72
Activations Density: 0.092%