INDEX
Explanations
expressions of enjoyment and regret related to food consumption
Negative Logits
alink     -0.17
Explicit  -0.16
ustil     -0.15
ameda     -0.15
673       -0.15
seksi     -0.15
город     -0.14
dcc       -0.14
ampo      -0.14
貸        -0.14
Positive Logits
consume      0.45
consuming    0.45
consumption  0.44
Consum       0.41
consumed     0.41
consumes     0.40
eat          0.40
eating       0.40
wolf         0.39
scarf        0.38
Activations Density 0.405%