Explanations
references to physical aspects or attributes
Negative Logits
Token         Logit
aca           -0.18
avel          -0.18
eday          -0.15
aren          -0.15
Stub          -0.15
uyết          -0.14
lid           -0.14
.functional   -0.14
dex           -0.14
option        -0.13
Positive Logits
Token         Logit
ity           0.27
mente         0.25
physical      0.22
ities         0.21
s             0.21
physically    0.19
ized          0.19
ITY           0.19
Physical      0.18
dehyde        0.17
Activations Density 0.030%