Explanations
phrases related to health and exercise
Neuron Alignment
Index    Value    % of L₁
1437     +0.11    0.4%
1515     +0.10    0.3%
1865     +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1437     +0.11       0.03
1351     +0.10       0.02
74       +0.10       0.02
Negative Logits
Token      Logit
accla      -1.10
emphat     -1.00
maneu      -0.99
cushi      -0.95
shenan     -0.94
volunte    -0.94
reluct     -0.94
ugg        -0.90
strick     -0.88
encomp     -0.87
Positive Logits
Token      Logit
gym        1.18
fitness    1.13
workout    0.98
Fitness    0.97
fitness    0.94
gyms       0.94
Gym        0.90
Fitness    0.89
gym        0.86
FITNESS    0.85
Activations Density: 0.178%