Explanations
names or terms starting with the letter "C" that often include a person's name or a brand
specific tokens related to a particular category or context, indicated by the pattern of activations
Negative Logits
Chero      -0.68
Tinder     -0.67
GOODMAN    -0.66
uyomi      -0.64
Eston      -0.62
Yamaha     -0.62
Wrestle    -0.61
props      -0.60
opening    -0.60
Rebels     -0.59
Positive Logits
overed     1.35
URRENT     1.31
esar       1.29
rossover   1.27
ottage     1.27
auldron    1.25
actus      1.24
abbage     1.23
rouch      1.22
ursor      1.21
Activations Density 0.047%
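
The top positive-logit tokens all read like words with the first letter missing. The minimal sketch below prepends "C" to each of them, which is consistent with the first explanation (terms starting with the letter "C"). It assumes that the token preceding each prediction is a bare "C"; the dashboard does not state this. The helper name `complete_with_prefix` is hypothetical and used only for illustration.

```python
# Top positive-logit tokens and values, copied from the list above.
positive_logits = {
    "overed": 1.35,
    "URRENT": 1.31,
    "esar": 1.29,
    "rossover": 1.27,
    "ottage": 1.27,
    "auldron": 1.25,
    "actus": 1.24,
    "abbage": 1.23,
    "rouch": 1.22,
    "ursor": 1.21,
}


def complete_with_prefix(tokens, prefix="C"):
    """Prepend a prefix to each token (assumption: the prior token is "C")."""
    return [prefix + token for token in tokens]


completions = complete_with_prefix(positive_logits)

# Print each reconstructed word next to its source token and logit.
for word, (token, logit) in zip(completions, positive_logits.items()):
    print(f"{word:<10} {token:<10} {logit:+.2f}")
```

Every reconstructed string is an ordinary word or name (for example "Cactus", "Cabbage", "Cursor"), which suggests the feature promotes continuations of a leading "C" rather than any particular category of names or brands.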