Explanations
words related to analysis and evaluation, such as "ultimately" and "nevertheless."
Neuron Alignment
Index   Value   % of L₁
478     +0.14   0.5%
382     +0.14   0.4%
47      +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
47      +0.14      0.07
478     +0.14      0.06
1616    +0.12      0.06
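The two column names in the Correlated Neurons table are not defined on this page. Assuming "P. Corr." is a Pearson correlation and "Cos Sim." is a cosine similarity, the quantities could be computed roughly as in the sketch below. The function names and input vectors are hypothetical, chosen only for illustration. They are not taken from the dashboard, and the page does not say which vectors (activations, weights, or directions) each metric is computed over.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, left unnormalised because the 1/n factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical inputs, for illustration only.
feature_values = [0.0, 0.2, 0.0, 0.9, 0.1]
neuron_values = [0.1, 0.3, 0.0, 0.7, 0.4]
print(round(pearson(feature_values, neuron_values), 3))
print(round(cosine_similarity(feature_values, neuron_values), 3))
```

Pearson correlation subtracts the mean of each sequence first and cosine similarity does not, so the two columns can differ for the same pair, as they do in the table.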
Negative Logits
Mockito       -0.58
Contexto      -0.56
Pos           -0.56
Ka            -0.55
Application   -0.55
Ad            -0.54
Var           -0.54
Ar            -0.54
Component     -0.53
It            -0.53
Positive Logits
stockholm     1.51
scrat         1.50
michelin      1.47
milf          1.45
eyel          1.44
snoopy        1.42
lidl          1.42
sappi         1.38
guarante      1.37
vhs           1.36
Activations Density 0.441%