Explanations
words related to body parts
abbreviations or shorthand representations of common terms
Negative Logits
newsp     -0.70
reluct    -0.69
citiz     -0.67
elig      -0.66
Debor     -0.61
seiz      -0.60
furious   -0.59
godd      -0.59
paran     -0.57
infuri    -0.57
Positive Logits
worm      1.37
less      1.24
worms     1.21
breaker   1.18
pipe      1.11
ful       1.07
heads     1.05
hole      1.05
hunter    1.05
fish      1.05
Activations Density: 0.180%