INDEX
Explanations
instances of the word "toss" with varying levels of intensity in the text
NEGATIVE LOGITS
KT          -0.68
HCR         -0.64
xia         -0.61
éĸ          -0.61
otype       -0.60
firsthand   -0.58
MAL         -0.58
PRES        -0.58
DEBUG       -0.58
livest      -0.58
POSITIVE LOGITS
toss        1.00
aside       0.97
overboard   0.96
weed        0.87
tossing     0.82
tossed      0.81
bowl        0.80
dab         0.76
stakes      0.76
enger       0.74
Activations Density: 0.034%