Explanations
words related to actions or objects involving biting or gnawing
words related to food, particularly snacks or small bites
Negative Logits
Token      Logit
Wink       -0.76
Princ      -0.76
tes        -0.76
Goth       -0.71
GBT        -0.70
fitting    -0.69
YL         -0.68
asar       -0.68
Painter    -0.67
Malk       -0.66
Positive Logits
Token      Logit
hig        0.88
gn         0.83
hors       0.79
nib        0.78
meat       0.77
nood       0.76
scratch    0.75
pri        0.75
agues      0.75
spoon      0.73
Activations Density 0.014%
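
The source does not define the density figure. One common reading, assumed here, is that it is the fraction of token positions on which the feature has a nonzero activation. Under that assumption, a minimal sketch of the calculation looks like the following. The function name `activation_density` and the sample data are hypothetical and chosen only to reproduce a value of 0.014%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 14 of which activate the feature.
sample = [0.0] * 99_986 + [1.5] * 14
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

A density this low would mean the feature fires on roughly 1 in 7,000 token positions, which fits a narrow feature tied to a small set of biting and snack-related tokens.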