INDEX
Explanations
words related to different types of soups and broths
references to different types of soup
Negative Logits
yrights    -0.89
tm         -0.86
RL         -0.77
Ul         -0.76
TM         -0.75
MW         -0.74
GC         -0.70
Phant      -0.69
Ul         -0.68
Edge       -0.67
Positive Logits
soup       3.92
Soup       2.90
broth      2.04
curry      1.67
stew       1.56
salad      1.52
sou        1.51
spaghetti  1.47
noodles    1.43
pasta      1.39
Activations Density: 0.026%