INDEX
Explanations
mentions of the word "Bear", typically referring to a sports team or a character
mentions of "Bear" and related terms
Negative Logits
ADRA       -0.85
unal       -0.84
ournal     -0.81
anwhile    -0.80
arcer      -0.78
selves     -0.78
istry      -0.76
uries      -0.73
Seym       -0.71
ugu        -0.71
Positive Logits
bear       1.03
Bear       0.99
cats       0.97
Bears      0.94
Gry        0.91
cone       0.91
Traps      0.90
chet       0.88
beit       0.84
Trap       0.84
Activations Density 0.015%