INDEX
Explanations
references to high costs or expensive items
Negative Logits
ixin         -0.16
lide         -0.16
hod          -0.15
tery         -0.14
usz          -0.14
otherwise    -0.14
ao           -0.13
quot         -0.13
inger        -0.13
conda        -0.13
Positive Logits
SPA          0.18
elter        0.15
šak          0.15
elper        0.15
ERV          0.15
çak          0.14
edom         0.14
.strict      0.14
endon        0.14
ARRIER       0.14
Activations Density: 0.014%