INDEX
Explanations
words related to technical issues or problems
Neuron Alignment
Index    Value    % of L₁
297      +0.14    0.4%
1288     +0.12    0.4%
2034     +0.11    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
332      +0.14       0.07
297      +0.12       0.07
990      +0.11       0.06
Negative Logits
vogli      -0.89
sappi      -0.86
voleva     -0.80
ognuno     -0.77
felipe     -0.73
sergio     -0.73
alberto    -0.71
roberto    -0.71
poteva     -0.70
peines     -0.69
Positive Logits
queryParams     0.57
gaily           0.56
Transylvania    0.56
nobly           0.55
Nobles          0.54
Frankfort       0.54
Grenville       0.54
חיצוניים        0.53
Broughton       0.53
Whig            0.53
Activations Density: 0.495%