Explanations
mentions of the word "ginger" or related terms
Negative Logits
ategory    -0.79
zzi        -0.76
ĵĺ         -0.70
igslist    -0.67
illon      -0.66
rior       -0.65
Shelby     -0.65
iries      -0.65
compan     -0.64
respond    -0.63
Positive Logits
bread      1.62
glass      0.96
bats       0.88
prise      0.84
lich       0.83
mond       0.78
laus       0.77
lings      0.76
bang       0.76
hoff       0.74
Activations Density 0.025%