Explanations
text related to television services and streaming platforms
Neuron Alignment
Index   Value   % of L₁
908     +0.11   0.3%
1403    +0.10   0.3%
678     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
678     +0.11      0.05
908     +0.10      0.03
89      +0.10      0.03
Negative Logits
DebuggerStep   -0.57
lefs           -0.56
Prí            -0.55
loài           -0.55
caufe          -0.55
fuper          -0.54
Glej           -0.51
Iné            -0.50
whofe          -0.50
Více           -0.50
Positive Logits
pyjama          0.61
subscription    0.59
hairc           0.58
riva            0.58
subscriptions   0.58
hoody           0.58
bayern          0.56
streaming       0.56
fumo            0.56
stik            0.55
Activations Density 0.473%