Explanations
phrases that involve comparisons or equivalences
Negative Logits
Token               Logit
)";                 -0.72
Wikispecies         -0.69
)");                -0.66
Falla               -0.64
RSSSF               -0.63
íslu                -0.62
();)                -0.61
"""                 -0.61
NSCoder             -0.60
::::::::::::::::    -0.60
Positive Logits
Token               Logit
like                0.92
Like                0.89
LIKE                0.83
Like                0.78
Unlike              0.70
like                0.69
LIKE                0.66
imitate             0.64
Unlike              0.64
seperti             0.63
Activations Density: 0.145%