Explanations
adjectives related to physical appearance and personality traits
Neuron Alignment
Index    Value    % of L₁
1967     +0.15    0.5%
394      +0.11    0.3%
1705     +0.11    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1597     +0.15       0.03
1363     +0.11       0.04
198      +0.11       0.05
Negative Logits
pessi        -0.69
abbra        -0.65
librement    -0.60
Konkur       -0.60
overla       -0.59
citoy        -0.59
migli        -0.58
apparti      -0.58
Доброго      -0.58
abnorm       -0.57
Positive Logits
колеп        0.52
gratify      0.49
беріга       0.47
FBref        0.47
หวัด         0.45
"\<          0.43
pcx          0.43
ACTERS       0.43
teenager     0.43
ness         0.42
Activations Density: 0.243%