INDEX
Explanations
occurrences of the word "that" in various contexts
Neuron Alignment
Index   Value   % of L₁
271     +0.13   0.7%
283     +0.13   0.7%
88      +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
191     +0.13      0.07
88      +0.13      0.04
335     +0.12      0.06
Negative Logits
continents   -1.80
forty        -1.69
pros         -1.67
earth        -1.64
thirty       -1.62
enth         -1.56
boats        -1.56
except       -1.55
dwell        -1.55
fifty        -1.52
Positive Logits
Ĩ     3.36
ij    3.33
¿½    3.25
¡     3.24
¨     3.21
·¸    3.17
½     3.16
µ     3.15
©     3.12
¿     3.10
Activations Density 1.000%