INDEX
Explanations
phrases indicating comparison or contrast
references to alternatives or other choices
Negative Logits
anny                -0.69
onel                -0.67
nutshell            -0.62
enegger             -0.62
utenant             -0.61
ANN                 -0.59
Unle                -0.59
¶ħ                  -0.58
oker                -0.58
ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³  -0.58
Positive Logits
similarly   1.21
similar     1.09
equally     0.91
likewise    0.89
ypes        0.78
besides     0.78
fared       0.78
Similar     0.73
theirs      0.73
same        0.72
Activations Density: 0.504%