INDEX
Explanations
phrases related to expressing dissatisfaction or grievances
Neuron Alignment
Index    Value    % of L₁
899      +0.12    0.4%
1520     +0.11    0.4%
1937     +0.10    0.3%
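
The page does not say how the alignment figures are derived. A minimal sketch, assuming "Value" is the component of a feature's direction vector on a given neuron and "% of L₁" is that component's absolute value divided by the vector's L₁ norm; the function name `neuron_alignment` and the toy vector are hypothetical and are not taken from this page.

```python
def neuron_alignment(direction, top_k=3):
    """Return (index, value, share_of_l1) for the top_k largest-magnitude entries.

    Assumption: `direction` is a feature's direction vector in neuron space.
    """
    l1_norm = sum(abs(v) for v in direction)
    if l1_norm == 0:
        return []
    # Rank neuron indices by the magnitude of their component, largest first.
    ranked = sorted(range(len(direction)),
                    key=lambda i: abs(direction[i]),
                    reverse=True)
    return [(i, direction[i], abs(direction[i]) / l1_norm)
            for i in ranked[:top_k]]


# Toy vector, not data from this page.
toy_direction = [0.5, -0.25, 0.0, 0.25]
rows = neuron_alignment(toy_direction, top_k=2)
for index, value, share in rows:
    print(f"{index:<8}{value:+.2f}    {share:.1%}")
```

Under this reading, shares of only 0.3% to 0.4% for the top neurons would mean the direction is spread thinly across many neurons rather than concentrated on a few.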
Correlated Neurons
Index    P. Corr.    Cos Sim.
899      +0.12       0.03
1861     +0.11       0.03
1520     +0.10       0.03
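
The two columns can disagree because they measure different things. A minimal sketch, assuming "P. Corr." abbreviates Pearson correlation and "Cos Sim." cosine similarity; the page does not state which vectors each is computed over, so the inputs below are hypothetical toy lists.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance of mean-centred values over the product of spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cosine(x, y):
    """Cosine similarity: dot product over the product of Euclidean norms (no centring)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Toy data, not values from this page: y is x shifted by a constant.
x = [1.0, 2.0, 3.0]
y = [11.0, 12.0, 13.0]
print(pearson(x, y), cosine(x, y))
```

Pearson correlation subtracts the mean before comparing, so a constant offset leaves it unchanged, while cosine similarity is sensitive to the offset.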
Negative Logits
apparti    -0.63
afferma    -0.62
ricorda    -0.61
impon      -0.58
prega      -0.58
conosce    -0.58
gioca      -0.56
lancia     -0.56
vernis     -0.55
posa       -0.54
Positive Logits
complaint      1.19
complain       1.17
complaints     1.16
complaint      1.06
Complaints     1.05
complaining    1.03
Complaint      1.03
complained     1.02
Complaint      1.02
complains      1.00
Activations Density 0.088%
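
The density figure is not defined on the page. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the toy activations are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: one activation value per sampled token.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy activations, not data from this page: one active token out of four.
toy_activations = [0.0, 0.0, 1.5, 0.0]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.088% would correspond to the feature firing on roughly 9 of every 10,000 tokens.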