INDEX
Explanations
references to overcoming physical challenges and recovery from injuries
Negative Logits
ród        -0.15
tail       -0.15
ube        -0.14
etri       -0.14
ash        -0.14
adia       -0.14
vil        -0.13
Respect    -0.13
noses      -0.13
Circuit    -0.13
Positive Logits
mobility         0.33
Mobility         0.30
walks            0.27
independence     0.27
walking          0.27
walker           0.27
independ         0.27
mob              0.25
walk             0.25
independently    0.24
Activations Density: 0.124%