Explanations
adjectives ending in 'y' with a positive connotation
Neuron Alignment
Index   Value   % of L₁
1350    +0.19   0.7%
1978    +0.12   0.4%
866     +0.10   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1350    +0.19      0.05
260     +0.12      0.04
1363    +0.10      0.04
Negative Logits
🕗                  -0.81
Doğ                 -0.71
Ağ                  -0.69
alnız               -0.67
الحره               -0.66
accéder             -0.65
simplif             -0.64
település          -0.64
DataPropertyName    -0.62
ędzynarod           -0.61
Positive Logits
carina      0.67
spacy       0.64
sneaky      0.62
y           0.61
uzzy        0.61
licky       0.60
bumpy       0.60
galleria    0.60
vecchia     0.58
rainy       0.58
Activations Density 0.292%