INDEX
Explanations
references to the name "Ig" or words associated with "Ig"
Neuron Alignment
Index    Value    % of L₁
410      +0.17    1.0%
71       +0.14    0.8%
376      +0.14    0.8%
Correlated Neurons
Index    P. Corr.    Cos Sim.
410      +0.17       0.01
71       +0.14       0.01
301      +0.14       0.01
Negative Logits
?>         -1.76
thood      -1.65
pite       -1.60
illance    -1.53
neut       -1.52
beating    -1.50
etry       -1.49
FIG        -1.41
aying      -1.38
"?"        -1.37
Positive Logits
ILITY       1.60
ģ           1.45
ĻĤ          1.44
gens        1.40
lical       1.39
^**         1.39
)|$(        1.36
(@          1.34
alore       1.32
contexts    1.32
Activations Density 0.007%