INDEX
Explanations
the word "bots" at different levels of activation
references to specific toys or themed objects, particularly ones that correlate to popular culture
Negative Logits
intimid    -0.79
acknow     -0.70
livest     -0.69
awa        -0.69
confir     -0.68
prosecut   -0.67
streng     -0.66
Reviewer   -0.66
Granger    -0.65
skelet     -0.65
Positive Logits
wana       1.23
anu        1.08
nikov      1.04
onga       0.99
heet       0.98
uba        0.95
assium     0.94
rons       0.94
weet       0.91
aku        0.90
Activations Density: 0.012%