INDEX
Explanations
mentions of the word "bear" and related terms, including the football team, the animal, and figurative uses
Negative Logits
istry        -0.77
lectic       -0.73
ADRA         -0.72
selves       -0.70
Gutenberg    -0.68
opus         -0.67
ablishment   -0.66
ubuntu       -0.65
isters       -0.65
arters       -0.65
Positive Logits
cub          0.97
bear         0.90
Bears        0.87
hug          0.87
xual         0.84
Grizz        0.81
hugs         0.78
beit         0.78
claws        0.77
cats         0.74
Activations Density 0.029%