INDEX
Explanations
phrases that indicate alternatives or substitutions
Negative Logits
Token    Logit
POCH     -0.16
cope     -0.16
dni      -0.15
Bor      -0.15
ogl      -0.15
ittal    -0.14
isu      -0.14
sho      -0.14
din      -0.14
-gap     -0.14
Positive Logits
Token    Logit
idual    0.15
instead  0.15
ĶåĽŀ     0.15
chaft    0.15
instead  0.15
touches  0.14
poly     0.14
vez      0.14
swick    0.14
holm     0.14
Activations Density: 0.014%