Explanations
phrases indicating concern or care for others
Negative Logits
viso      -0.16
ÑĨин      -0.16
uga       -0.16
ummer     -0.15
esor      -0.15
651       -0.15
Crown     -0.14
ovich     -0.14
alian     -0.14
ưá»Ŀi     -0.14
Positive Logits
rieb      0.15
ÙĤب       0.15
lied      0.15
pipeline  0.14
and       0.14
Banc      0.14
conduit   0.14
æľºåħ³    0.14
Lap       0.14
squ       0.14
Activations Density 0.015%