INDEX
Explanations
occurrences of the word "Brand" in various contexts
Negative Logits
sav      -0.17
dorf     -0.16
yon      -0.16
iyi      -0.15
utron    -0.15
iyeti    -0.15
ILT      -0.14
IOR      -0.14
seau     -0.14
arty     -0.14
Positive Logits
enburg   0.28
ão       0.24
t        0.22
ejs      0.20
strup    0.19
-new     0.18
wine     0.18
ao       0.17
emark    0.17
ts       0.17
Activations Density 0.010%