INDEX
Explanations
references to the Coca-Cola brand
Negative Logits
eldre     -0.07
erset     -0.07
/down     -0.07
uth       -0.07
edly      -0.06
athers    -0.06
omer      -0.06
ce        -0.06
curring   -0.06
eners     -0.06
Positive Logits
iel           0.08
izo           0.07
766           0.07
-Cola         0.07
onds          0.07
lish          0.07
thora         0.06
677           0.06
-Identifier   0.06
ëĵĿ           0.06
Activations Density 0.000%