INDEX
Explanations
phrases related to watching or observing, especially in intense or critical situations
Neuron Alignment
Index   Value   % of L₁
1392    +0.09   0.3%
1264    +0.09   0.3%
1828    +0.09   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1392    +0.09      0.04
1264    +0.09      0.03
1828    +0.09      0.03
Negative Logits
konkre    -0.62
benzin    -0.61
hek       -0.59
treff     -0.57
kafe      -0.57
kompres   -0.56
erk       -0.56
garan     -0.56
vermel    -0.55
ekster    -0.55
Positive Logits
watching       0.74
watched        0.71
watched        0.70
Watched        0.68
watch          0.68
watches        0.67
watching       0.66
Watched        0.65
contextLoads   0.61
Watching       0.61
Activations Density 0.188%