Explanations
phrases that express a sense of appreciation or commentary on various situations
Negative Logits
¶           -0.72
76561       -0.68
ILCS        -0.67
龍契士      -0.67
apons       -0.66
Oswald      -0.65
sylvania    -0.64
\<          -0.64
withd       -0.63
Nov         -0.62
Positive Logits
coincidence  1.07
wonderful    0.96
bunch        0.93
lot          0.93
lovely       0.93
hypocr       0.87
marvelous    0.86
difference   0.86
heck         0.85
fantastic    0.85
Activations Density 0.009%