INDEX
Explanations
mentions of the word "Ram"
Neuron Alignment
Index    Value    % of L₁
421      +0.15    0.7%
757      +0.14    0.6%
896      +0.14    0.6%
Correlated Neurons
Index    P. Corr.    Cos Sim.
896      +0.15       0.02
757      +0.14       0.03
966      +0.14       0.02
Negative Logits
Punj         -0.54
briefcase    -0.52
hulk         -0.51
bulkhead     -0.50
awd          -0.46
irited       -0.45
mazda        -0.45
softshell    -0.45
dachshund    -0.44
aspet        -0.43
Positive Logits
Ram       1.55
Ram       1.49
ram       1.32
RAM       1.26
Rams      1.21
RAM       1.18
Rams      1.15
ram       1.07
Ramsey    1.04
rams      0.96
Activations Density 0.088%
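
The density figure above can be read as the share of sampled tokens on which the feature fires. The sketch below assumes that reading (an activation strictly above zero counts as firing); the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of tokens where the feature
    is active. The dashboard itself does not state its definition.
    """
    activations = list(activations)
    if not activations:
        raise ValueError("need at least one activation value")
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Hypothetical sample: the feature fires on 88 of 100,000 tokens.
sample = [1.5] * 88 + [0.0] * (100_000 - 88)
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would mean the feature is silent on almost every token, which fits a feature tied to a single word and its spelling variants.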