INDEX
Explanations
comparisons and evaluations using the word "like"
phrases that express comparisons or opinions about situations or things
Negative Logits
ascript     -0.89
ashore      -0.82
enhagen     -0.78
auga        -0.76
代          -0.73
enberg      -0.71
milo        -0.71
herer       -0.70
onductor    -0.70
agos        -0.69
Positive Logits
quaint           0.85
contradiction    0.83
innocuous        0.78
pmwiki           0.77
oxy              0.77
plausible        0.76
logical          0.75
coincidence      0.74
deviation        0.71
cliché           0.70
Activations Density: 0.114%