Explanations
words related to body parts and anatomical terminology
Negative Logits
Token     Logit
æ´        -0.17
reach     -0.16
orough    -0.15
agher     -0.15
avin      -0.15
requ      -0.15
ró        -0.15
بر        -0.15
undle     -0.14
axon      -0.14
Positive Logits
Token     Logit
eer       0.21
i         0.21
er        0.17
apolis    0.17
eu        0.17
yes       0.16
aar       0.16
etes      0.16
esimal    0.15
ic        0.15
Activations Density 0.092%