INDEX
Explanations
instances of numerical references or citations
Neuron Alignment
Index  Value  % of L₁
469    +0.12  0.7%
222    +0.11  0.6%
455    +0.11  0.6%
Correlated Neurons
Index  P. Corr.  Cos Sim.
469    +0.12     0.03
50     +0.11     0.02
316    +0.11     0.03
Negative Logits
á̝           -1.59
ocamp        -1.52
felony       -1.51
misdemeanor  -1.49
remedies     -1.49
ocument      -1.44
urel         -1.43
theories     -1.43
adem         -1.43
ienna        -1.38
Positive Logits
lan       1.65
Ltd       1.53
HL        1.49
League    1.45
locks     1.38
Gur       1.37
fan       1.34
lap       1.34
occupied  1.33
Aviv      1.31
Activations Density: 0.118%