INDEX
Explanations
instructions for cooking a specific dish
Neuron Alignment
Index   Value   % of L₁
1013    +0.13   0.4%
876     +0.13   0.4%
121     +0.11   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
736     +0.13      0.03
166     +0.13      0.03
753     +0.11      0.01
Negative Logits
Wikisource   -0.93
monaster     -0.92
prostitu     -0.77
Kün          -0.76
Contribu     -0.69
Mémoires     -0.69
jurist       -0.68
theolog      -0.68
manuten      -0.68
Ordre        -0.68
Positive Logits
oven         0.66
eût          0.64
semblait     0.62
heating      0.62
friable      0.61
heat         0.60
shenan       0.58
ovens        0.58
marié        0.57
heated       0.57
Activations Density: 0.133%