Explanations
pronouns and references to food preparation
Head Attr Weights
0:0.02
1:0.01
2:0.07
3:0.06
4:0.11
5:0.03
6:0.05
7:0.36
8:0.04
9:0.03
10:0.10
11:0.07
Negative Logits
illance      -1.62
notation     -1.60
ividual      -1.58
vernment     -1.57
ombat        -1.52
rights       -1.50
fax          -1.45
annotation   -1.44
reditation   -1.43
ailability   -1.42
Positive Logits
cuc      1.39
uces     1.36
Veg      1.34
Curry    1.34
squash   1.30
curry    1.30
sour     1.30
vodka    1.29
dish     1.26
fres     1.26
Activations Density: 0.002%