INDEX
Explanations
mentions of food-related words
references to food or food-related topics
Negative Logits
Token        Logit
Constantin   -0.73
MIA          -0.72
Ago          -0.66
Anita        -0.65
Canaver      -0.63
chal         -0.63
Khe          -0.62
APE          -0.62
asper        -0.61
Dresden      -0.61
Positive Logits
Token        Logit
ood          1.14
yssey        1.10
lers         1.04
edly         1.02
les          0.95
ler          0.92
ed           0.90
esley        0.89
ling         0.89
shake        0.88
Activations Density: 0.028%