INDEX
Explanations
names of cities and dates
occurrences of the letter 'W' in various contexts
Negative Logits
unpre            -0.82
bottleneck       -0.74
Prelude          -0.68
gratification    -0.67
uate             -0.65
Malfoy           -0.62
apprehension     -0.62
sucker           -0.62
Khe              -0.61
İĭ               -0.61
Positive Logits
OW               1.25
restling         1.24
ITNESS           1.24
atts             1.22
ITCH             1.22
reck             1.21
alking           1.21
ALK              1.21
orthy            1.20
ORD              1.19
Activations Density 0.036%
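
One way to sanity-check the "letter 'W'" explanation against the positive-logit tokens listed above is to prepend a "W" to each token and see whether a word results. The sketch below does only that. The helper name `prepend_w` is hypothetical, and the idea that the feature fires just before these continuations is an assumed reading of the listing, not something the page states.

```python
# Top positive-logit tokens, copied from the listing above.
positive_tokens = [
    "OW", "restling", "ITNESS", "atts", "ITCH",
    "reck", "alking", "ALK", "orthy", "ORD",
]


def prepend_w(token: str) -> str:
    """Prefix a token with 'W' or 'w', matching the token's own case.

    Hypothetical helper for illustration only.
    """
    prefix = "W" if token.isupper() else "w"
    return prefix + token


# Build the candidate completions and print them next to the raw tokens.
completions = [prepend_w(t) for t in positive_tokens]
for token, word in zip(positive_tokens, completions):
    print(f"{token:>10} -> {word}")
```

Every token in the list becomes a recognizable word or name under this check, which is consistent with the second explanation. It is not evidence for the first one (names of cities and dates).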