Explanations
percentage values in various contexts
Negative Logits
erness    -0.15
exus      -0.15
esco      -0.14
pcm       -0.14
eren      -0.14
ares      -0.13
ë£Į       -0.13
.:        -0.13
_prime    -0.13
elines    -0.13
Positive Logits
pound     0.18
hell      0.17
Hell      0.16
Hell      0.16
istrov    0.16
zag       0.16
dots      0.14
alf       0.14
hoff      0.14
cult      0.14
Activations Density 0.009%