INDEX
Explanations
references to individuals named Josh
Neuron Alignment
Index   Value   % of L₁
331     +0.23   1.3%
1416    +0.13   0.7%
966     +0.13   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
331     +0.23      0.03
1097    +0.13      0.03
1416    +0.13      0.02
Negative Logits
<bos>        -1.33
guma         -0.71
intersper    -0.67
bago         -0.62
pinak        -0.59
Kek          -0.58
Lá           -0.58
inaugurate   -0.58
gani         -0.57
Fé           -0.57
Positive Logits
Josh     1.60
Josh     1.53
josh     1.48
Joshua   1.23
Joshua   1.18
josh     1.06
diable   0.72
OSH      0.68
joyeux   0.67
topaz    0.66
Activations Density 0.443%