Explanations
phrases that emphasize care, attention, and pride in one's work or service
Negative Logits
().'/      -0.13
irit       -0.13
ibe        -0.13
positor    -0.13
lak        -0.13
кому       -0.13
505        -0.13
adh        -0.13
(by        -0.12
řik        -0.12
Positive Logits
into       0.34
pains      0.29
into       0.28
Into       0.28
great      0.27
Into       0.27
_into      0.27
special    0.26
seriously  0.25
INTO       0.24
Activations Density 0.066%