Explanations
the word "Boss" with a strong emphasis
mentions of "Boss" and "Block."
Negative Logits
Token      Logit
="#        -0.76
WAYS       -0.72
literacy   -0.71
seekers    -0.69
ebook      -0.66
ibaba      -0.65
cele       -0.65
swipe      -0.65
spir       -0.65
Austral    -0.64
Positive Logits
Token      Logit
Boss       2.94
Block      1.92
Boss       1.58
Deck       1.48
Mech       1.10
Bugs       1.04
Tank       1.02
Eck        0.95
Studio     0.95
Pos        0.95
Activations Density: 0.009%