INDEX
Explanations
adverbs ending in 'ly'
Neuron Alignment
Index   Value   % of L₁
1013    +0.11   0.3%
2034    +0.10   0.3%
382     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
163     +0.11      0.02
1473    +0.10      0.02
971     +0.09      0.04
Negative Logits
Glej          -0.79
Související   -0.75
Dijo          -0.69
Zgod          -0.66
Había         -0.65
Wszyst        -0.65
quiler        -0.63
Vlastnosti    -0.61
Wymiary       -0.61
Dě            -0.61
Positive Logits
rheumat       0.67
bayern        0.65
€/            0.63
utop          0.63
herbes        0.63
Cartes        0.63
idr           0.63
.             0.62
vivace        0.62
frans         0.61
Activations Density 0.225%