Explanations
references to various types of worms
references to worms and related terminology
Negative Logits
Token               Logit
ctic                -0.82
Palestin            -0.72
culated             -0.72
ioch                -0.71
orically            -0.71
owered              -0.68
++++++++++++++++    -0.67
orie                -0.67
ccording            -0.66
yles                -0.65
Positive Logits
Token               Logit
hole                1.38
worm                1.27
worms               1.25
holes               1.16
tail                1.09
fish                1.05
worms               1.04
worm                0.95
weed                0.95
larvae              0.91
Activations Density: 0.010%