INDEX
Explanations
adjectives related to admiration or comparison
Neuron Alignment
Index   Value   % of L₁
1505    +0.10   0.3%
1253    +0.09   0.2%
860     +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
100     +0.10      0.04
1505    +0.09      0.04
860     +0.08      0.04
Negative Logits
MathML          -0.61
didn            -0.53
Παραπομπές      -0.50
garant          -0.49
Externí         -0.49
WithFormat      -0.49
nullable        -0.49
unik            -0.49
েল              -0.47
autoIncrement   -0.47
Positive Logits
hairc       1.68
milf        1.54
shenan      1.42
depic       1.40
maneu       1.36
swarovski   1.34
madonna     1.34
embodi      1.32
scrat       1.29
squa        1.27
Activations Density 0.268%