Explanations
adjectives describing a high degree of something
Neuron Alignment
Index   Value   % of L₁
478     +0.23   0.8%
1870    +0.13   0.5%
517     +0.12   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
478     +0.23      0.03
1776    +0.13      0.02
517     +0.12      0.02
Negative Logits
présidenti   -0.66
abstrait     -0.65
hadur        -0.56
réaliste     -0.55
gyar         -0.53
privilégi    -0.53
typique      -0.53
magique      -0.53
fraz         -0.52
exécu        -0.52
Positive Logits
nemia        0.83
postolic     0.82
Ottobre      0.81
thermomix    0.77
Luglio       0.75
psycopg      0.72
highly       0.71
Highly       0.71
highly       0.71
democra      0.70
Activations Density 0.063%