INDEX
Explanations
terms related to sponsorship and partnerships
Negative Logits
Token     Logit
ern       -0.15
ole       -0.15
ourmet    -0.14
gil       -0.14
ugging    -0.14
eker      -0.14
arden     -0.14
umin      -0.14
al        -0.13
ey        -0.13
Positive Logits
Token     Logit
ships     0.30
ship      0.26
SHIP      0.19
chaft     0.18
ilities   0.17
manship   0.17
iment     0.17
sed       0.16
hips      0.16
aggio     0.15
Activations Density: 0.014%