INDEX
Explanations
words related to eating and food consumption
Negative Logits
']))      -0.93
}}],      -0.86
Prou      -0.80
}));      -0.80
ρίου      -0.79
'])){     -0.78
Himo      -0.77
()))      -0.75
]))       -0.74
())))     -0.71
Positive Logits
eat       2.05
EAT       1.79
eats      1.79
eaten     1.78
eating    1.75
Eat       1.74
Eat       1.64
ate       1.63
Eating    1.58
eat       1.56
Activations Density 0.034%
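
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of token positions on which the feature's activation is non-zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 10,000 token positions.
acts = [0.0] * 10_000
acts[5] = 1.2
acts[123] = 0.4
acts[9_000] = 2.7

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.030%
```

Under this reading, a density of 0.034% would correspond to the feature firing on roughly 34 of every 100,000 tokens, which fits a feature tied to a narrow family of tokens such as forms of "eat".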