Explanations
phrases indicating competitive advantages or superiority
Negative Logits
Token      Logit
urve       -0.15
okes       -0.15
vette      -0.14
venes      -0.14
itta       -0.14
pie        -0.14
หมด        -0.14
odio       -0.14
spender    -0.14
isz        -0.14
Positive Logits
Token        Logit
jet          0.16
advantage    0.16
uppen        0.16
pong         0.15
747          0.14
Advantage    0.14
chy          0.14
McGregor     0.14
mere         0.14
continental  0.14
Activations Density: 0.044%