Explanations
adjectives related to negative emotions
Neuron Alignment
Index   Value   % of L₁
605     +0.10   0.3%
1473    +0.09   0.3%
1622    +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
260     +0.10      0.04
1746    +0.09      0.04
1473    +0.08      0.03
Negative Logits
JoinTable    -0.60
ElementRef   -0.58
تقاوى        -0.57
meyi         -0.57
rendono      -0.56
çalves       -0.55
OnDestroy    -0.54
tréal        -0.54
éndez        -0.54
mesini       -0.53
Positive Logits
mef          0.92
intersper    0.91
fta          0.88
fuj          0.86
nmax         0.84
levis        0.82
yves         0.81
triton       0.81
alberto      0.79
lts          0.79
Activations Density: 0.521%