INDEX
Explanations
references to food and dining experiences
Negative Logits
ulo     -0.16
sequ    -0.15
cogn    -0.15
base    -0.14
fn      -0.14
int     -0.13
çŃ      -0.13
бом     -0.13
elow    -0.13
USR     -0.13
Positive Logits
takeaway  0.28
take      0.24
carry     0.23
Take      0.23
TAKE      0.23
Carry     0.22
take      0.21
_take     0.21
Take      0.21
carry     0.20
Activations Density 0.091%