INDEX
Explanations
references to the second person pronoun "you"
Negative Logits
Token      Logit
áf         -0.17
hiro       -0.17
bjerg      -0.14
uard       -0.14
iew        -0.14
ungan      -0.14
probably   -0.13
deliber    -0.13
Lon        -0.13
Gand       -0.13
Positive Logits
Token      Logit
647        0.17
883        0.16
varsa      0.16
olit       0.15
اض         0.15
ä»Ķ        0.15
ever       0.15
è¿Ļæł·     0.14
hadn       0.14
fos        0.14
Activations Density: 0.119%