INDEX
Explanations
references to physical bearing, weight, or characteristics
instances of the word "bear" in various contexts
Negative Logits
Token         Logit
anwhile       -0.78
lectic        -0.72
inx           -0.71
arta          -0.71
enta          -0.70
committee     -0.69
arters        -0.66
phis          -0.66
ugu           -0.66
ablishment    -0.66
Positive Logits
Token         Logit
bear          1.02
cub           0.88
claws         0.87
bear          0.87
Bears         0.87
Grizz         0.83
bears         0.82
paws          0.82
beit          0.80
paw           0.79
Activations Density: 0.011%