INDEX
Explanations
words related to animals
token endings or sentence delimiters
Negative Logits
Token          Logit
iferation      -0.81
)=(            -0.77
Interstitial   -0.76
condem         -0.75
oppable        -0.73
Palestin       -0.70
ISTER          -0.69
tert           -0.69
proport        -0.68
mberg          -0.67
Positive Logits
Token          Logit
fish           1.24
tailed         1.22
ishly          1.12
frog           1.09
bear           1.07
carc           1.01
hawk           1.01
owl            0.98
toe            0.98
hole           0.98
Activations Density 0.148%