Explanations
occurrences of the phrase "my name is" and variations of introductions
Negative Logits
Token      Logit
одо        -0.17
abit       -0.15
립         -0.14
ÄĽl        -0.14
peror      -0.14
ijken      -0.14
arend      -0.14
auer       -0.14
esp        -0.14
нам        -0.14
Positive Logits
Token      Logit
ifu        0.15
Jaune      0.14
clerosis   0.14
åı«        0.14
withheld   0.14
urgeon     0.14
isposable  0.13
меÑĤ       0.13
_digest    0.13
/Area      0.13
Activations Density: 0.017%