INDEX
Explanations
the word "just" when it appears in various contexts
Neuron Alignment
Index    Value    % of L₁
376      +0.15    0.8%
241      +0.13    0.7%
91       +0.12    0.7%
Correlated Neurons
Index    Pearson Corr.    Cos Sim.
243      +0.15            0.03
241      +0.13            0.03
473      +0.12            0.03
Negative Logits
phrine       -1.57
athe         -1.49
best         -1.44
uclear       -1.43
brightest    -1.39
strongest    -1.37
life         -1.33
worst        -1.31
drugs        -1.29
relief       -1.29
Positive Logits
ifi          1.88
ices         1.86
ifiable      1.75
iom          1.71
isman        1.70
isma         1.66
IFY          1.56
ICES         1.54
ice          1.53
ified        1.53
Activations Density 0.143%