INDEX
Explanations
comparisons related to ease or difficulty
the word "easier" in various contexts
Negative Logits
Saint     -0.74
inals     -0.72
Origin    -0.70
Pall      -0.69
wine      -0.68
eters     -0.66
reen      -0.66
hips      -0.66
alian     -0.65
Nether    -0.64
Positive Logits
than          1.03
forgiving     0.96
prey          0.95
compr         0.88
"$:/          0.82
manageable    0.76
readable      0.73
easy          0.72
iquid         0.71
behaved       0.68
Activations Density 0.012%