Explanations
the word "butter"
references to butter
Negative Logits
SPONSORED   -0.77
Citizen     -0.72
vernment    -0.71
Homeless    -0.68
Civic       -0.68
Exile       -0.65
debtor      -0.63
Xi          -0.63
Suns        -0.62
Cheong      -0.62
Positive Logits
cream       1.25
butter      1.02
beer        1.00
netflix     0.97
cup         0.95
flies       0.94
boarding    0.94
nesday      0.93
nut         0.91
nuts        0.89
Activations Density: 0.008%