Explanations
references to physical weight and lifting activities
Negative Logits
esso    -0.16
raham   -0.14
éĿ¢     -0.14
otten   -0.14
inois   -0.13
ucci    -0.13
arkin   -0.13
Leo     -0.13
ürn     -0.13
Twig    -0.13
Positive Logits
Dob          0.16
ogle         0.16
ÏĢά          0.14
ocab         0.14
_ALIGN       0.14
irrational   0.14
tas          0.14
olian        0.14
quisitions   0.14
lw           0.14
Activations Density 0.213%