INDEX
Explanations
references to tigers and their habitats
Negative Logits
tridge      -0.16
letcher     -0.16
ategory     -0.15
YTE         -0.15
abilit      -0.15
ibold       -0.15
xea         -0.14
Responder   -0.14
Hodg        -0.14
tes         -0.14
Positive Logits
Woods       0.30
cub         0.25
Tiger       0.22
Cub         0.22
Cubs        0.21
Claw        0.21
woods       0.20
claw        0.20
tiger       0.18
Lily        0.18
Activations Density 0.008%