INDEX
Explanations
phrases expressing excessive or overwhelming sentiments
Negative Logits
toch     -0.18
pu       -0.17
map      -0.17
ydı      -0.16
ry       -0.16
st       -0.16
мов      -0.16
wood     -0.15
рен      -0.15
patch    -0.15
Positive Logits
led      0.32
boot     0.30
o        0.26
ledo     0.24
oot      0.24
Boot     0.23
oo       0.22
gether   0.21
much     0.21
boot     0.20
Activations Density 0.022%