INDEX
Explanations
words related to folding or models that are foldable
Negative Logits
asive     -0.17
ÐĬ        -0.16
erap      -0.16
eled      -0.15
ÄĽt       -0.14
eland     -0.14
andler    -0.14
assen     -0.14
éĿ©       -0.14
ples      -0.14
Positive Logits
able            0.20
Fold            0.20
backs           0.19
folding         0.18
-fold           0.18
fold            0.17
defaultstate    0.17
æĬĺ             0.17
fold            0.16
ery             0.15
Activations Density: 0.019%