INDEX
Explanations
contrastive phrases that highlight differences between subjects or scenarios
Negative Logits
Portale           -0.68
ystemet           -0.67
NSCoder           -0.67
zzleHttp          -0.66
************/     -0.65
tartalomajánló    -0.65
bezeichneter      -0.64
CURIAM            -0.63
@"/               -0.63
subconscious      -0.62
Positive Logits
Whereas           0.82
Whereas           0.73
Unlike            0.68
unlike            0.67
Unlike            0.63
Whilst            0.59
unlike            0.58
Mentre            0.55
Mientras          0.54
)))));            0.53
Activations Density 0.107%