Explanations
proper nouns ending in 'se'
the word "Asian" and its various forms or contexts
Negative Logits
Token         Logit
è¦ļéĨĴ        -0.76
owl           -0.69
hawk          -0.69
hoops         -0.67
shorth        -0.65
scheduling    -0.63
wildfire      -0.63
iants         -0.61
liking        -0.61
fences        -0.61
Positive Logits
Token         Logit
eker          1.27
gger          1.01
venth         1.00
gments        0.97
vere          0.97
ldom          0.92
vier          0.91
ve            0.90
vent          0.89
xy            0.89
Activations Density: 0.021%
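
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a value is commonly obtained. It assumes that density means the fraction of sampled token positions on which the feature's activation is above zero. The page does not state its exact definition, so the threshold, the function names (activation_density, format_percent), and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions where the feature fires.

    Assumption: a position counts as "firing" when its activation is
    strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


def format_percent(density):
    """Format a fraction as a percentage with three decimal places."""
    return f"{density * 100:.3f}%"


# Hypothetical sample: 10,000 token positions, of which 2 are non-zero.
acts = [0.0] * 10000
acts[17] = 1.3
acts[4242] = 0.4

density = activation_density(acts)
print(format_percent(density))  # prints 0.020%
```

A density this low means the feature is sparse: it is silent on almost all tokens and fires only on a narrow set of contexts.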