Explanations
proper nouns or names
the presence of the suffix "st" in various contexts
Negative Logits
Token       Logit
thumbs      -0.66
disabling   -0.66
deaf        -0.62
amy         -0.60
timely      -0.59
calming     -0.58
Haram       -0.58
Bills       -0.55
Peel        -0.55
crest       -0.55
Positive Logits
Token       Logit
oppers      1.19
retch       1.17
oppable     1.16
itute       1.14
itution     1.13
hetics      1.10
ructure     1.08
alk         1.07
rikes       1.07
rict        1.06
Activation Density: 0.041%