Explanations
words related to obsession and fixation
Neuron Alignment
Index   Value   % of L₁
1516    +0.11   0.4%
1372    +0.11   0.3%
124     +0.11   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
124     +0.11      0.03
1372    +0.11      0.03
1363    +0.11      0.03
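
For reference, a minimal sketch of how the two similarity columns in the Correlated Neurons table could be computed. It assumes "P. Corr." means Pearson correlation and "Cos Sim." means cosine similarity, both taken over paired activation vectors. The function names and sample activations are hypothetical, and the dashboard's actual pipeline is not shown in the source.

```python
import math


def pearson_corr(xs, ys):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    return cov / norm


def cosine_sim(u, v):
    """Cosine similarity of the raw (uncentered) vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


# Hypothetical activations over the same five tokens (not taken from the dashboard).
feature_acts = [0.0, 0.0, 1.2, 0.0, 0.4]
neuron_acts = [0.3, -0.1, 0.9, 0.2, 0.1]

print(f"P. Corr.: {pearson_corr(feature_acts, neuron_acts):+.2f}")
print(f"Cos Sim.: {cosine_sim(feature_acts, neuron_acts):.2f}")
```

The two measures differ only in mean-centering, so they can diverge when activations have a nonzero mean, which is consistent with the table reporting them in separate columns.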
Negative Logits
makro      -0.52
anonyme    -0.50
solidar    -0.49
boh        -0.47
geograf    -0.47
Völ        -0.47
Bakter     -0.47
dras       -0.47
alpina     -0.47
generali   -0.47
Positive Logits
obsession       1.02
obsessed        0.95
obses           0.86
obsessive       0.80
obses           0.79
dirait          0.78
Obs             0.73
fascination     0.69
preoccupation   0.66
pylab           0.63
Activations Density 0.141%