
Error resetting euid #10

@danpovey

Guys,
does anyone know the likely cause of an error raised at the following line of code?

https://github.com/gridengine/gridengine/blob/master/source/daemons/shepherd/shepherd.c#L325

We had a working GridEngine setup, and it broke after we upgraded from Debian 8 to Debian 9.
We now get errors like:
Cannot reset euid dpovey due to Operation not permitted
whenever a user's qlogin gets scheduled on a node.
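
For reference, "Operation not permitted" is the standard text for errno EPERM, which seteuid returns when the calling process is not allowed to switch to the requested uid. The following is a minimal Python sketch of that behaviour outside GridEngine. It is not taken from shepherd.c, `try_seteuid` is a hypothetical helper, and it says nothing about why the shepherd ends up in that state.

```python
import errno
import os


def try_seteuid(uid):
    """Attempt os.seteuid(uid); return None on success, else the errno name."""
    try:
        os.seteuid(uid)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


current = os.geteuid()

# Setting the effective uid to the value it already has needs no privilege.
print("seteuid(current euid):", try_seteuid(current))

if current != 0:
    # A process without root privilege (and without a saved uid of 0)
    # is refused when it asks for euid 0.
    print("seteuid(0):", try_seteuid(0))
    print("EPERM text:", os.strerror(errno.EPERM))
```

Run as an ordinary user, the second call should report EPERM. Run as root, only the first call is made, so the sketch never changes the credentials of the process.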

Job 8883854 caused action: none
 User        = dpovey
 Queue       = [email protected]
 Start Time  = <unknown>
 End Time    = <unknown>
failed in prolog: 04/30/2018 15:52:43 [119:9698]: exit_status of prolog = 143
Shepherd trace:
04/30/2018 15:52:43 [0:9698]: shepherd called with uid = 0, euid = 0
04/30/2018 15:52:43 [0:9698]: qlogin_daemon = builtin
04/30/2018 15:52:43 [119:9698]: starting up 8.1.9
04/30/2018 15:52:43 [119:9698]: setpgid(9698, 9698) returned 0
04/30/2018 15:52:43 [119:9698]: do_core_binding: "binding" parameter not found in config file
04/30/2018 15:52:43 [119:9698]: calling fork_pty()
04/30/2018 15:52:43 [119:9698]: parent: forked "prolog" with pid 9700
04/30/2018 15:52:43 [119:9698]: using signal delivery delay of 120 seconds
04/30/2018 15:52:43 [119:9698]: parent: prolog-pid: 9700
04/30/2018 15:52:43 [60410:9698]: Cannot reset euid dpovey due to Operation not permitted
04/30/2018 15:52:43 [60410:9698]: now sending signal TERM to pid -9700
04/30/2018 15:52:43 [60410:9698]: Cannot reset euid dpovey due to Operation not permitted
04/30/2018 15:52:43 [60410:9698]: now sending signal TERM to pid -9700
04/30/2018 15:52:43 [119:9698]: Poll received POLLHUP (Hang up). Unregister the FD.
04/30/2018 15:52:43 [119:9698]: wait3 returned 9700 (status: 15; WIFSIGNALED: 1,  WIFEXITED: 0, WEXITSTATUS: 0)
04/30/2018 15:52:43 [119:9698]: prolog exited with exit status 0
04/30/2018 15:52:43 [119:9698]: reaped "prolog" with pid 9700
04/30/2018 15:52:43 [119:9698]: prolog exited due to signal
04/30/2018 15:52:43 [119:9698]: prolog signaled: 15
04/30/2018 15:52:43 [119:9698]: exit_status of prolog = 143
04/30/2018 15:52:43 [119:9698]: no epilog script to start
04/30/2018 15:52:43 [119:9698]: writing exit status to qrsh: 0
04/30/2018 15:52:43 [119:9698]: sending UNREGISTER_CTRL_MSG with exit_status = "0"
04/30/2018 15:52:43 [119:9698]: sending to host: <null>
04/30/2018 15:52:43 [119:9698]: comm_write_message returned: can't find handle
04/30/2018 15:52:43 [119:9698]: close_parent_loop: comm_write_message() returned 0 instead of 1!!!
04/30/2018 15:52:43 [119:9698]: waiting for UNREGISTER_RESPONSE_CTRL_MSG
04/30/2018 15:52:43 [119:9698]: No connection or problem while waiting for message: 1
04/30/2018 15:52:43 [119:9698]: parent: cl_com_ignore_timeouts
04/30/2018 15:52:43 [119:9698]: parent: error in comm_cleanup_lib(): 3
04/30/2018 15:52:43 [119:9698]: parent: leaving close_parent_loop()

Shepherd error:
04/30/2018 15:52:43 [119:9698]: exit_status of prolog = 143
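
Two things in the trace are easier to see with a small script. This is a rough Python sketch, assuming each trace line has the layout "date time [uid:pid]: message" and that the first bracketed number is the uid the shepherd was running under when it wrote the line (an assumption, not checked against the source). `uid_changes` and `decode_exit_status` are hypothetical helper names. The first lists the lines where the bracketed uid changes. The second applies the usual shell convention that a status above 128 means "killed by signal (status - 128)".

```python
import re
import signal

# Assumed layout: "MM/DD/YYYY HH:MM:SS [uid:pid]: message"
TRACE_LINE = re.compile(r"^(\S+ \S+) \[(\d+):(\d+)\]: (.*)$")


def uid_changes(trace_text):
    """Return (uid, message) for each line whose bracketed uid differs
    from that of the previous matching line."""
    changes = []
    previous_uid = None
    for line in trace_text.splitlines():
        match = TRACE_LINE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a trace line
        uid = int(match.group(2))
        if uid != previous_uid:
            changes.append((uid, match.group(4)))
            previous_uid = uid
    return changes


def decode_exit_status(status):
    """Split a shell-style exit status into (kind, value)."""
    if status > 128:
        return ("signal", status - 128)
    return ("exit", status)


# A few lines copied from the trace above.
sample = """\
04/30/2018 15:52:43 [0:9698]: shepherd called with uid = 0, euid = 0
04/30/2018 15:52:43 [119:9698]: starting up 8.1.9
04/30/2018 15:52:43 [119:9698]: parent: prolog-pid: 9700
04/30/2018 15:52:43 [60410:9698]: Cannot reset euid dpovey due to Operation not permitted
04/30/2018 15:52:43 [60410:9698]: now sending signal TERM to pid -9700
04/30/2018 15:52:43 [119:9698]: Poll received POLLHUP (Hang up). Unregister the FD.
"""

for uid, message in uid_changes(sample):
    print(uid, message)

kind, value = decode_exit_status(143)
print(kind, value, signal.Signals(value).name)
```

On the sample lines the uid sequence is 0, 119, 60410, 119, so the "Cannot reset euid" messages are the only ones written under uid 60410. Status 143 decodes to signal 15 (SIGTERM), which matches the "prolog signaled: 15" line.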
