INDEX
Explanations
deadlines or time-sensitive words and phrases
Neuron Alignment
Index    Value    % of L₁
1602     +0.14    0.5%
1870     +0.14    0.5%
1271     +0.13    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1705     +0.14       0.04
1363     +0.14       0.03
1602     +0.13       0.03
Negative Logits
intersper    -0.82
compréhen    -0.76
confé        -0.74
pixabay      -0.69
Molto        -0.69
shenan       -0.68
accla        -0.67
expéri       -0.67
Châ          -0.67
Molto        -0.66
Positive Logits
criteria      0.73
threshold     0.72
criterion     0.63
deadline      0.61
thresholds    0.61
threshold     0.61
guidelines    0.59
ITERIA        0.57
Criteria      0.57
criteria      0.56
Activations Density 0.207%