INDEX
Explanations
phrases or words related to offers, services, or promotions
occurrences of the substring "fer"
Negative Logits
oker      -0.62
Decay     -0.61
ooks      -0.61
hift      -0.60
eye       -0.60
IGHTS     -0.59
invent    -0.59
heres     -0.59
arya      -0.58
olved     -0.57
Positive Logits
ring      1.13
ior       0.96
ministic  0.95
rer       0.94
andom     0.87
ocious    0.84
rers      0.84
minist    0.82
opol      0.81
dinand    0.81
Activations Density: 0.045%