INDEX
Explanations
references to bathroom-related activities and situations
Neuron Alignment
Index    Value    % of L₁
1065     +0.14    0.5%
61       +0.13    0.4%
1964     +0.10    0.3%
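The "% of L₁" column can be read as each entry's share of the vector's total absolute magnitude. A minimal sketch under that assumption follows; the function name `percent_of_l1` and the toy vector are hypothetical, and the dashboard's actual computation is not stated here.

```python
def percent_of_l1(weights, index):
    """Return |weights[index]| as a percentage of the vector's L1 norm.

    Assumption: "% of L1" means one entry's absolute value divided by
    the sum of absolute values of all entries.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return 100.0 * abs(weights[index]) / l1_norm


# Hypothetical toy vector, not data from the dashboard.
toy_weights = [0.5, -0.25, 0.25]
share = percent_of_l1(toy_weights, 0)
```

Small percentages such as those in the table would then indicate that no single neuron dominates the vector.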
Correlated Neurons
Index    Pearson Corr.    Cosine Sim.
1065     +0.14            0.03
61       +0.13            0.02
492      +0.10            0.02
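Pearson correlation and cosine similarity are related but distinct measures: Pearson correlation is the cosine similarity of the two vectors after each has been mean-centered. The sketch below illustrates the difference on hypothetical toy vectors; which vectors the dashboard compares for each column is not stated here, so this is an illustration of the two formulas only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical toy vectors, not data from the dashboard.
u = [1.0, 2.0, 3.0]
v = [3.0, 2.0, 1.0]
cos_uv = cosine_similarity(u, v)        # positive: all entries share a sign
pearson_uv = pearson_correlation(u, v)  # negative: v falls as u rises
```

Because centering removes the shared offset, the two measures can disagree even in sign, which is why a table may report both.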
Negative Logits
Token     Logit
strick    -1.23
reluct    -1.20
fuf       -1.20
alre      -1.17
affor     -1.16
accla     -1.15
?...      -1.13
impra     -1.12
encomp    -1.12
inev      -1.11
Positive Logits
Token        Logit
toilet       1.13
bathroom     1.11
bathrooms    1.00
toilets      1.00
bathroom     0.96
Bathroom     0.91
Toilet       0.86
restroom     0.86
Bathroom     0.83
toilet       0.81
Activations Density 0.101%
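An activation density figure is commonly read as the fraction of sampled tokens on which the feature fires at all. A minimal sketch under that assumption follows; the function name `activation_density`, the zero threshold, and the toy activations are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly above the threshold.

    Assumption: a token counts as "active" when the feature's
    activation on it is greater than zero.
    """
    if not activations:
        raise ValueError("need at least one activation")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy activations, not data from the dashboard.
toy_activations = [0.0, 0.0, 1.5, 0.0]
density = activation_density(toy_activations)
```

Under this reading, a density of 0.101% would mean the feature is active on roughly one token in a thousand.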