INDEX
    Explanations

    phrases that indicate physical discomfort or adaptation

    Negative Logits
    Token        Logit
    " Gerr"      -0.15
    "strict"     -0.15
    "rees"       -0.14
    "andbox"     -0.14
    "erra"       -0.14
    "zza"        -0.14
    "open"       -0.14
    "μπ"         -0.14
    "loor"       -0.14
    " open"      -0.14
    Positive Logits
    Token        Logit
    " habit"     0.25
    "习"         0.25
    " become"    0.23
    "habit"      0.23
    " Habit"     0.23
    " привы"     0.23
    "Become"     0.21
    "hab"        0.21
    " hab"       0.21
    "bec"        0.21
    Act Density 0.214%

    No Known Activations