INDEX
Explanations
prices or quantities of specific fruits
mentions of the fruit "apple."
Negative Logits
Token         Logit
interrupted   -0.83
uled          -0.76
efeated       -0.72
ocumented     -0.71
emporary      -0.67
citizens      -0.66
Tsarnaev      -0.65
Gazette       -0.64
gettable      -0.64
Roose         -0.64
Positive Logits
Token         Logit
apple         1.40
cider         1.33
apples        1.31
apple         1.10
fruit         1.05
juice         1.01
baum          0.97
pie           0.93
pies          0.92
jack          0.90
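
The page does not say how the logit lists are produced. One common approach, assumed here and not confirmed by the source, is to project a feature's output direction through the model's unembedding matrix and rank vocabulary tokens by the resulting value. The sketch below illustrates that idea on made-up numbers; `logit_effects`, `top_logit_tokens`, the toy vocabulary, and both matrices are hypothetical and are not taken from this dashboard.

```python
def logit_effects(decoder_dir, unembed):
    """Project a feature direction through an unembedding matrix.

    decoder_dir: list of d_model floats (hypothetical feature direction).
    unembed: d_model rows, each a list of vocab_size floats.
    Returns one score per vocabulary token.
    """
    vocab_size = len(unembed[0])
    return [
        sum(d * row[j] for d, row in zip(decoder_dir, unembed))
        for j in range(vocab_size)
    ]


def top_logit_tokens(decoder_dir, unembed, vocab, k=10):
    """Return (positive, negative): the k highest and k lowest scoring tokens."""
    effects = logit_effects(decoder_dir, unembed)
    ranked = sorted(zip(vocab, effects), key=lambda pair: pair[1])
    negative = ranked[:k]          # lowest scores first
    positive = ranked[::-1][:k]    # highest scores first
    return positive, negative


# Toy data only: a 2-dimensional model with a 4-token vocabulary.
vocab = ["apple", "cider", "interrupted", "the"]
unembed = [
    [1.0, 0.5, -1.0, 0.0],
    [0.0, 0.25, 0.0, 0.0],
]
decoder_dir = [1.0, 1.0]

positive, negative = top_logit_tokens(decoder_dir, unembed, vocab, k=2)
print("positive:", positive)
print("negative:", negative)
```

Under this reading, a high positive value means the feature pushes the model toward predicting that token, and a strongly negative value means it pushes away from it.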
Activations Density 0.015%
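
Activation density is usually read as the fraction of sampled tokens on which the feature fires at all. Assuming that definition (the page does not state it), a minimal sketch of the calculation looks like this; `activation_density`, the threshold, and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    activations: one float per sampled token (hypothetical data).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Made-up sample: 100,000 tokens, 15 of which activate the feature.
acts = [0.0] * 100_000
for i in range(15):
    acts[i * 1000] = 2.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.015%
```

At a density of 0.015%, the feature would fire on roughly 15 out of every 100,000 tokens, which fits a narrow, topic-specific feature.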