Explanations
references to steepness or difficulty levels in various contexts
Negative Logits
uv       -0.17
adan     -0.15
ipers    -0.15
rael     -0.15
ims      -0.14
werp     -0.14
IMS      -0.14
wers     -0.14
389      -0.14
itive    -0.14
Positive Logits
biology      0.15
$class       0.15
sat          0.14
оступ        0.14
ôn           0.14
olf          0.14
asel         0.14
ศร           0.14
istically    0.14
_pll         0.14
Activations Density 0.010%