INDEX
Explanations
phrases related to food and cooking, as well as criticism or judgmental language
Neuron Alignment
Index   Value   % of L₁
1535    +0.13   0.5%
1343    +0.12   0.5%
1363    +0.10   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1363    +0.13      0.07
1343    +0.12      0.09
1013    +0.10      0.09
Negative Logits
<bos>          -2.33
vainly         -1.50
impelled       -1.23
intersper      -1.14
apprehen       -1.13
endeavouring   -1.04
ineffec        -1.02
endeavored     -1.01
triumphantly   -1.00
hastened       -1.00
Positive Logits
sappi      1.28
eronau     1.25
capulco    1.22
ristor     1.21
dirond     1.15
monaster   1.15
tramont    1.15
quarelle   1.13
soggior    1.12
pecuni     1.11
Activations Density: 0.907%