INDEX
Explanations
references to numerical data or citation metrics in academic contexts
Negative Logits
edo       -0.20
eros      -0.15
oje       -0.15
ught      -0.15
zig       -0.14
/www      -0.14
zier      -0.13
presso    -0.13
ook       -0.12
uesta     -0.12
Positive Logits
quat      0.18
uzzi      0.15
enco      0.15
idel      0.14
-inline   0.14
belt      0.14
ovsky     0.14
aller     0.13
loquent   0.13
jad       0.13
Activations Density 0.005%