Explanations
instructional or procedural language related to customer service and returns
Negative Logits
erring    -0.16
igner     -0.15
Äı        -0.14
conti     -0.14
insula    -0.14
ÏĦολ      -0.13
ehler     -0.13
crater    -0.13
ride      -0.13
rep       -0.13
Positive Logits
odash     0.18
451       0.17
orda      0.17
nature    0.16
enda      0.16
nature    0.16
precise   0.15
ilters    0.15
rupa      0.15
izzo      0.14
Activations Density 0.037%