INDEX
Explanations
comparisons using the word "like"
Neuron Alignment
Index   Value   % of L₁
554     +0.15   0.5%
1323    +0.11   0.3%
1096    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
554     +0.15      0.05
1053    +0.11      0.03
892     +0.09      0.04
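The two columns above measure different things, and a small sketch can make the distinction concrete. It assumes "P. Corr." is the Pearson correlation between two activation series over the same tokens and "Cos Sim." is the cosine similarity between two weight vectors. The dashboard itself does not state this, and the function names and toy data below are hypothetical.

```python
import numpy as np

def pearson_corr(x, y):
    """Pearson correlation of two equal-length activation series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()  # centre each series before comparing
    yc = y - y.mean()
    return float(xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc)))

def cosine_sim(u, v):
    """Cosine similarity of two vectors (no centring)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical toy data: activations of a feature and a neuron on five tokens.
feature_acts = np.array([0.0, 0.0, 1.0, 0.0, 2.0])
neuron_acts = np.array([0.1, 0.0, 0.9, 0.2, 2.1])
print(pearson_corr(feature_acts, neuron_acts))

# Hypothetical toy data: a feature direction and a neuron's weight vector.
feature_dir = np.array([1.0, 0.0, 0.5])
neuron_weights = np.array([0.2, 1.0, 0.1])
print(cosine_sim(feature_dir, neuron_weights))
```

Under these assumptions the two scores need not agree: a neuron can co-activate with a feature without its weights pointing in the feature's direction.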
Negative Logits
Biografía     -0.43
营            -0.42
утбо          -0.42
новништво     -0.42
vulnerables   -0.41
參考文獻      -0.40
OLIC          -0.40
ж             -0.39
Column        -0.39
Kaynakça      -0.39
Positive Logits
shenan       0.80
nutella      0.77
indestru     0.76
impra        0.75
reluct       0.75
thermomix    0.74
gild         0.73
tupperware   0.71
Mlle         0.71
viciss       0.71
Activations Density: 0.101%