Explanations
references to personal experiences and emotions
Negative Logits
chten     -0.15
asa       -0.15
lags      -0.14
ä½ľ       -0.14
ocator    -0.13
rethink   -0.13
èŃ        -0.13
904       -0.13
adan      -0.13
cpy       -0.13
Positive Logits
sitting    0.28
walk       0.28
sit        0.28
step       0.27
standing   0.27
stepping   0.25
Walk       0.25
walk       0.25
walked     0.25
walks      0.25
Activations Density 0.258%
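
The density figure above can be read as the share of token positions on which the feature fires. The snippet below is a minimal sketch of that calculation, assuming density means "fraction of positions with activation above zero"; the dashboard does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature is
    active; the source page does not define it.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 2,000 token positions, 5 of which fire.
toy = [0.0] * 1995 + [0.7, 1.2, 0.3, 2.1, 0.9]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.250%
```

Under this reading, a density of 0.258% would mean the feature is active on roughly 1 in every 390 token positions.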