INDEX
Explanations
occurrences of the word "You" in various forms
Negative Logits
ÑĨи       -0.16
loi       -0.16
ÃŃt       -0.16
ipur      -0.15
allon     -0.14
.Dot      -0.14
ÅĻaz      -0.14
оÑıн      -0.14
idge      -0.14
wright    -0.14
Positive Logits
Tube      0.20
Tube      0.19
tube      0.19
ths       0.18
fulness   0.17
AIT       0.17
ermal     0.16
кÑĢаÑĹ    0.15
(th       0.15
tube      0.15
Activations Density 0.070%