Explanations
comparative adjectives and phrases expressing similarity
Negative Logits
/change     -0.16
rix         -0.15
Affero      -0.14
RIEND       -0.14
uity        -0.14
ented       -0.14
reon        -0.14
SEP         -0.13
MSN         -0.13
759         -0.13
Positive Logits
equally     0.16
ahun        0.16
_DEFINED    0.15
.navigator  0.15
mo          0.14
ployment    0.14
Owner       0.14
gewater     0.14
ioso        0.14
rz          0.14
Activations Density: 0.038%