Explanations
elements related to attributes and behaviors in technical contexts
Negative Logits
amt      -0.16
OTE      -0.16
ade      -0.16
illard   -0.14
CADE     -0.14
oub      -0.14
edor     -0.14
vyk      -0.14
ROUGH    -0.13
åħ       -0.13
Positive Logits
unlike     0.91
Unlike     0.65
Unlike     0.64
whereas    0.52
compared   0.51
Whereas    0.47
contrary   0.41
Compared   0.39
contrast   0.35
contrast   0.33
Activations Density: 0.589%