INDEX
Explanations
comparisons using the phrase "just as."
Negative Logits
anca     -0.20
ạm       -0.18
asma     -0.17
loub     -0.17
kaar     -0.14
rias     -0.14
thus     -0.14
uhl      -0.14
.fhir    -0.14
iou      -0.14
Positive Logits
arily    0.15
/cop     0.15
zeitig   0.14
CI       0.14
Ord      0.13
ty       0.13
aru      0.13
ufe      0.13
eru      0.13
anger    0.13
Activations Density 0.020%