INDEX
Explanations
details related to diet and food preferences
Negative Logits
alcohol        -0.67
cognac         -0.65
coffee         -0.63
Kopi           -0.63
whiskey        -0.62
Alcohol        -0.62
whisky         -0.61
AddTagHelper   -0.60
coffee         -0.60
alcoholic      -0.60
Positive Logits
carrots        0.89
Carrots        0.87
cabbage        0.85
vegetables     0.82
carrot         0.80
vegetable      0.80
turnips        0.80
Cabbage        0.79
greens         0.78
Carrot         0.76
Activations Density: 0.291%