Explanations
phrases comparing two things, stating that one thing is more like the other
phrases expressing comparisons or analogies
Negative Logits
Sharp         -0.75
Cros          -0.70
Vacc          -0.69
RAW           -0.69
;;;;;;;;;;;;  -0.68
ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³  -0.62
©¶æ¥µ         -0.61
Stars         -0.60
phia          -0.60
Arch          -0.60
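Two of the negative-logit tokens, `ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³` and `©¶æ¥µ`, look like mojibake. They are consistent with the printable surface form that GPT-2-style byte-level BPE vocabularies use for raw bytes. The sketch below assumes that encoding (the dump does not name its tokenizer) and maps each character back to its byte before decoding as UTF-8. The helper name `decode_token` is hypothetical.

```python
def bytes_to_unicode():
    """Byte -> printable-character table in the style of GPT-2's byte-level BPE."""
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted to code points 256 and above.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: surface character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(surface: str) -> str:
    """Hypothetical helper: recover the text behind a byte-level token string."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in surface)
    # Tokens may start or end mid-character, so undecodable bytes are replaced.
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³"))  # サーティワン
print(decode_token("©¶æ¥µ"))  # begins with stray continuation bytes, ends with 極
```

Under this assumption the first token is the katakana string サーティワン. The second starts in the middle of a multi-byte character, so only its final character, 極, decodes cleanly.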
Positive Logits
than       1.53
than       1.48
Than       1.12
erous      0.77
oult       0.71
cientious  0.69
rah        0.68
oot        0.68
vention    0.67
ner        0.66
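Lists like the two above are commonly produced by projecting a feature's output direction onto each token's unembedding vector and keeping the extremes. The dump does not say how its values were computed, so the following is only a sketch of that common approach. The function names and the two-dimensional vectors are made up for illustration and are not taken from this feature.

```python
def logit_effects(decoder_direction, unembedding):
    """Dot the feature direction with each token's unembedding vector."""
    return {
        token: sum(d * u for d, u in zip(decoder_direction, vector))
        for token, vector in unembedding.items()
    }


def top_and_bottom(effects, k):
    """Return the k most positive and the k most negative (token, value) pairs."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k], ranked[-k:][::-1]


# Toy, invented numbers chosen only to make the arithmetic easy to follow.
direction = [1.0, -0.5]
unembed = {
    " than": [1.5, 0.0],
    "Than": [1.0, -0.25],
    "Sharp": [-0.5, 0.5],
    "RAW": [-0.25, 0.75],
}

positive, negative = top_and_bottom(logit_effects(direction, unembed), k=2)
print(positive)  # [(' than', 1.5), ('Than', 1.125)]
print(negative)  # [('Sharp', -0.75), ('RAW', -0.625)]
```

Read this way, a positive value means the feature pushes the model toward predicting that token when it fires, and a negative value means it pushes away from it.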
Activations Density 0.152%
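A density figure like this is usually read as the fraction of sampled tokens on which the feature fires at all. The sketch below assumes that definition, with a hypothetical `activation_density` helper and synthetic activations built to reproduce the same percentage. It is not the dashboard's actual computation.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic data: 152 active positions out of 100,000 sampled tokens.
acts = [1.0] * 152 + [0.0] * (100_000 - 152)
print(f"{activation_density(acts):.3%}")  # 0.152%
```

At this density the feature is sparse: under the stated assumption it is active on roughly 15 of every 10,000 tokens.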