INDEX
Explanations
references to the brand "Polo" and related high-fashion terminology
Negative Logits
Token    Logit
aina     -0.15
çĩ       -0.14
cous     -0.14
quat     -0.14
martial  -0.14
ìĦĿ      -0.14
Serif    -0.14
okie     -0.14
dance    -0.14
Ninja    -0.14
Positive Logits
Token    Logit
polo     0.47
Polo     0.41
pol      0.28
pon      0.25
horses   0.25
pony     0.24
pol      0.23
horse    0.23
Pol      0.23
POL      0.22
Activations Density: 0.001%