Puppet: System Administration Automated

Ticket #565 (closed defect: fixed)

Opened 2 years ago

Last modified 2 years ago

Puppet::Util#execute blocks indefinitely

Reported by: mccune
Assigned to: mccune
Priority: normal
Milestone:
Component: library
Version:
Severity: normal
Keywords: exec zombie block service hang
Cc:
Triage Stage:
Attached Patches:
Complexity:

Description

Summary

Puppet::Util#execute may block indefinitely trying to read the pipe connected to the child process it spawns.

Impact Data

Fails consistently trying to manage /usr/sbin/automount on Mac OS X.

Related Issues

Related to #410 (Puppet::Util#execute should support a timeout)

Expected Behavior

The parent puppet process should detect the child's death and proceed.

Actual Behavior

The parent seems to block indefinitely at f.read in this section of code: source:trunk/lib/puppet/util.rb

        IO.popen("-") do |f|
            if f
                output = f.read
            else
                begin
                    $stdin.reopen("/dev/null")
                    $stderr.close
                    $stderr = $stdout.dup
                    if gid
                        Process.egid = gid
                        Process.gid = gid unless @@os == "Darwin"
                    end
                    if uid
                        Process.euid = uid
                        Process.uid = uid unless @@os == "Darwin"
                    end
                    if command.is_a?(Array)
                        Kernel.exec(*command)
                    else
                        Kernel.exec(command)
                    end
                rescue => detail
                    puts detail.to_s
                    exit!(1)
                end
            end
        end

Regression

Problem exists through 0.22.2.

Steps to Reproduce

On Mac OS X, run puppet as root with the following manifest:

file { "/tmp/auto_home": ensure => exists }
exec {"automount":
	command => "/usr/sbin/automount -tcp -m /tmp/home /tmp/auto_home -mnt /private/var/automount/tmp_home",
	require => File["/tmp/auto_home"]
}

Other Notes

I'm currently working on a patch. Test new code with:

cd trunk/test/util
./utiltest.rb

Expect 1 failure on Mac OS X and Red Hat when attempting to change the UID for a process.

  3) Failure:
test_get_provider_value(TestPuppetUtil) [./utiltest.rb:342]:
got invalid uid for root.
<0> expected but was
<nil>.

Attachments

exec_hacking_01.patch (1.5 kB) - added by mccune on 03/29/07 00:58:46.
NOT WORKING: Passes utiltest.rb, avoids zombie, doesn't fix the problem, still blocks at f.read.

Change History

03/28/07 23:57:04 changed by mccune

  • status changed from new to assigned.

If you look at the process table while puppet is blocked, you'll see a zombie for the automount process:

root     16946   0.0 -0.0        0      0  p2  Z+   31Dec69   0:00.00 (automount)
root     16947   0.0 -0.0    29424    984  ??  Ss    5:54PM   0:00.01 /usr/sbin/automount -f -tcp -m /tmp/home /tmp/auto_home -mnt /private/var/automount/tmp_home

It appears the parent automount process forks a child and then kills itself, yet Ruby does not reap the dead process (SIGCHLD) and keeps trying to read from the pipe.
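
For reference, a read on a pipe reaches end-of-file only after every process holding the write end has closed it, so a surviving daemon that inherited the pipe would be enough to keep f.read waiting. Whether that is what happens here is an assumption, not something confirmed in this ticket. Below is a minimal Ruby sketch of one way around it: capture output in a temporary file and wait only for the direct child. run_capturing is a hypothetical helper, not part of Puppet::Util and not necessarily what the eventual fix does.

```ruby
require "tempfile"

# Hypothetical helper (not Puppet's API): run a command, send its stdout and
# stderr to a temporary file, and wait only for the direct child process.
# Because no pipe is involved, a background process that inherits the
# output descriptor cannot keep the caller waiting for end-of-file.
def run_capturing(*command)
  output = nil
  status = nil
  Tempfile.create("exec-output") do |out|
    pid = fork do
      $stdin.reopen("/dev/null")
      $stdout.reopen(out)
      $stderr.reopen(out)
      exec(*command)
    end
    # Reap the direct child; this returns as soon as it exits or is killed.
    _, status = Process.wait2(pid)
    out.rewind
    output = out.read
  end
  [output, status]
end
```

A caller would use it as output, status = run_capturing("/usr/sbin/automount", ...) and inspect status.success? afterwards. The trade-off is that output written by a still-running background process after the direct child exits is not captured.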

03/29/07 00:58:46 changed by mccune

  • attachment exec_hacking_01.patch added.

NOT WORKING: Passes utiltest.rb, avoids zombie, doesn't fix the problem, still blocks at f.read.

03/29/07 23:30:14 changed by mccune

I've narrowed this down to the child killing off its parent. Executing the following bash script from puppet triggers the problem:

#!/bin/bash
bash -c 'kill -TERM $PPID'

When this script is killed by its own child, Ruby continues trying to read from the pipe.
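
Independent of the root cause, the timeout proposed in #410 would keep the caller from hanging forever. Here is a rough Ruby sketch of a bounded read. read_with_deadline and its :eof / :timeout return values are hypothetical names for illustration, not an existing Puppet API.

```ruby
# Hypothetical helper: read from io until end-of-file or until `timeout`
# seconds have elapsed, whichever comes first.
# Returns [data_read_so_far, :eof] or [data_read_so_far, :timeout].
def read_with_deadline(io, timeout)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  output = String.new
  loop do
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    return [output, :timeout] if remaining <= 0
    # Wait until the pipe is readable, but never past the deadline.
    next unless IO.select([io], nil, nil, remaining)
    begin
      output << io.readpartial(4096)
    rescue EOFError
      return [output, :eof]
    end
  end
end
```

In the code quoted in the description, this would stand in for the bare f.read; on :timeout the caller could then decide whether to kill and reap the child.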

04/03/07 17:53:48 changed by mccune

  • status changed from assigned to closed.
  • resolution set to fixed.

Fixed in r2385