INDEX
Explanations
phrases related to movement or direction
instances of the word "back" in various contexts
Negative Logits
aucus        -0.59
anto         -0.58
Disclosure   -0.57
delegation   -0.56
iston        -0.55
disclaim     -0.55
weather      -0.54
Booth        -0.54
uchi         -0.54
Attend       -0.54
Positive Logits
packs        1.23
fired        1.17
fires        1.12
wards        1.11
stab         1.05
dated        1.04
packing      1.00
haul         0.99
tracking     0.94
doors        0.89
Activations Density 0.054%