INDEX
Explanations
mentions or variations of the word "vegan."
instances of the word "vegan"
Negative Logits
Token       Logit
handlers    -0.72
showers     -0.71
iewicz      -0.66
shower      -0.61
Parties     -0.60
hanging     -0.60
labels      -0.59
rook        -0.58
hiding      -0.58
artifacts   -0.58
Positive Logits
Token       Logit
mber        1.38
ctors       1.38
gas         1.32
ggie        1.24
tted        1.22
ctive       1.16
ternity     1.11
gged        1.11
ter         1.11
gan         1.10
Activations Density 0.026%
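
The density figure can be read as the share of token positions on which the feature fires at all. The page does not state how it was computed, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature
    is active; the zero threshold is illustrative, not from the source.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 10,000 token positions, feature active on 3 of them.
sample = [0.0] * 10_000
for position in (17, 4_203, 9_998):
    sample[position] = 2.5

density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.030%
```

Under this reading, a density of 0.026% would correspond to the feature being active on roughly 26 out of every 100,000 token positions.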