INDEX
Explanations
phrases containing the word "out" with some preceding context
phrases indicating an exit or departure
Negative Logits
arsen      -0.72
tyr        -0.71
Brach      -0.67
0004       -0.63
compr      -0.62
avorite    -0.61
EStream    -0.60
inical     -0.60
Khe        -0.59
destro     -0.58
Positive Logits
stretched  1.35
lander     1.21
fitted     1.15
dated      1.08
flows      1.07
numbered   1.06
lier       1.05
smart      1.05
doors      1.05
landish    1.03
Activations Density: 0.052%