INDEX
Explanations
phrases related to food items, especially sandwiches
Negative Logits
Token     Logit
runner    -0.92
fee       -0.83
uno       -0.81
Reply     -0.81
lists     -0.75
jury      -0.74
gin       -0.73
cv        -0.73
uria      -0.71
429       -0.69
Positive Logits
Token     Logit
wich      1.03
insula    0.94
inant     0.89
berman    0.85
abwe      0.84
awks      0.79
nikov     0.77
reet      0.76
olson     0.75
awk       0.75
Activations Density: 6.229%