Explanations
expressions related to frustration or strong emotions
Neuron Alignment
Index   Value   % of L₁
479     +0.21   1.3%
423     +0.16   1.0%
204     +0.16   0.9%
Correlated Neurons
Index   P. Corr.   Cos Sim.
479     +0.21      0.12
259     +0.16      0.06
231     +0.16      0.13
Negative Logits
sake          -1.60
better        -1.50
longer        -1.49
harmonic      -1.43
foreseeable   -1.39
astom         -1.37
iplex         -1.34
purposes      -1.32
igent         -1.32
sciously      -1.32
Positive Logits
§     3.75
¿½    3.67
Ļª    3.63
Ģ     3.48
ĸ     3.46
¦     3.42
ĨĴ    3.42
ª     3.39
Ŀ     3.36
Ļ     3.36
Activations Density: 4.058%