Explanations
expressions related to comparisons, often emphasizing the significant or extreme nature of a situation or action
phrases indicating a comparative reduction or negation
Negative Logits
ーティ
-0.66
Net
-0.66
unic
-0.65
Kau
-0.63
ּ (U+05BC)
-0.62
imov
-0.62
Morrow
-0.62
Sky
-0.62
okin
-0.61
ULT
-0.61
Positive Logits
than
0.81
natureconservancy
0.76
nces
0.75
eloqu
0.74
honored
0.70
aneously
0.67
honoured
0.67
forgiving
0.67
tast
0.66
generous
0.64
Activations Density 0.015%