clone, clone2 – linux man page

August 27th, 2009

clone, clone2 – create a child process

USAGE
       #include <sched.h>

       int clone(int (*fn)(void *arg), void *child_stack, int flags, void *arg);

       _syscall2(int, clone, int, flags, void *, child_stack)

       _syscall5(int, clone, int, flags, void *, child_stack,
            int *, parent_tidptr, struct user_desc *, newtls,
            int *, child_tidptr)
               /* Using syscall(2) may be preferable; see intro(2) */

       int __clone2(int (*fn)(void *arg), void *child_stack,
              size_t stack_size, int flags, void *arg);

       _syscall6(int, clone2, int, flags, void *, child_stack,
              int, child_stack_size, int *, parent_tidptr,
              struct user_desc *, newtls, int *, child_tidptr)

DESCRIPTION
       The clone() system call can be used on all architectures except IA-64.
       On IA-64, __clone2() is available instead.  Both calls behave similarly:
       they create a new process, in a manner similar to fork(2).  clone() is
       a library function layered on top of the underlying clone() system
       call, hereinafter referred to as sys_clone.  __clone2() is likewise
       layered on top of the sys_clone2 system call.

       Unlike fork(2), these calls allow the child process to share  parts  of
       its  execution  context  with  the  calling process, such as the memory
       space, the table of file descriptors, and the table of signal handlers.
       (Note  that on this manual page, "calling process" normally corresponds
       to "parent process".  But see the description of CLONE_PARENT below.)

       The main use of clone() is to implement threads:  multiple  threads  of
       control in a program that run concurrently in a shared memory space.

       The behavior of __clone2() is the same as that of clone().  The only
       difference is that __clone2() takes one extra argument, stack_size,
       which specifies the size of the child stack.  The other arguments have
       the same meaning.

       When the child process is created with clone(), it executes  the  func-
       tion  application fn(arg).  (This differs from fork(2), where execution
       continues in the child from the point of the  fork(2)  call.)   The  fn
       argument is a pointer to a function that is called by the child process
       at the beginning of its execution.  The arg argument is passed  to  the
       fn function.

       When the fn(arg) function application returns, the child process termi-
       nates.  The integer returned by fn is the exit code for the child  pro-
       cess.   The  child  process  may  also  terminate explicitly by calling
       exit(2) or after receiving a fatal signal.

       The child_stack argument specifies the location of the  stack  used  by
       the  child process.  Since the child and calling process may share mem-
       ory, it is not possible for the child process to execute  in  the  same
       stack  as  the calling process.  The calling process must therefore set
       up memory space for the child stack and pass a pointer to this space to
       clone().   Stacks  grow  downwards  on  all  processors  that run Linux
       (except the HP PA processors), so child_stack  usually  points  to  the
       topmost address of the memory space set up for the child stack.
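
       As an illustration of the stack handling described above, the following
       is a minimal sketch, not a definitive implementation.  The names
       STACK_SIZE, child_fn and run_child are hypothetical, the 64 kB stack
       size is an arbitrary choice, and a downward-growing stack is assumed.
       The child simply returns its argument as its exit code.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)   /* hypothetical size, ample for this child */

/* Runs in the child; its return value becomes the child's exit code. */
static int child_fn(void *arg)
{
    return *(int *)arg;
}

/* Creates a child with clone() and returns the child's exit code,
   or -1 on error. */
int run_child(int code)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downwards, so pass the topmost address. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, SIGCHLD, &code);

    int status = 0;
    int ok = (pid != -1 && waitpid(pid, &status, 0) == pid);

    free(stack);
    return (ok && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```

       Because CLONE_VM is not used here, the child works on its own copy of
       the caller's memory, so the pointer to code remains valid in the child.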

       The  low  byte  of  flags contains the number of the termination signal
       sent to the parent when the child dies.  If this signal is specified as
       anything  other  than SIGCHLD, then the parent process must specify the
       __WALL or __WCLONE options when waiting for the child with wait(2).  If
       no  signal  is  specified, then the parent process is not signaled when
       the child terminates.
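
       The interaction with wait(2) can be sketched as follows.  This is an
       illustrative example under the assumption of a glibc system;
       wait_for_silent_child and child_fn are hypothetical names.  No
       termination signal is specified (the low byte of flags is zero), so
       the parent waits with __WALL.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)   /* hypothetical size, ample for this child */

static int child_fn(void *arg)
{
    (void)arg;
    return 3;                    /* exit code of the child */
}

/* Creates a child that sends no termination signal and returns its
   exit code, or -1 on error. */
int wait_for_silent_child(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* flags == 0: no signal number in the low byte, nothing shared. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, 0, NULL);

    /* The termination signal is not SIGCHLD, so __WALL (or __WCLONE)
       is required for the wait to consider this child. */
    int status = 0;
    int ok = (pid != -1 && waitpid(pid, &status, __WALL) == pid);

    free(stack);
    return (ok && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```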

       flags may also be bitwise-or’ed with zero or more of the following con-
       stants,  in order to specify what is shared between the calling process
       and the child process:

       CLONE_PARENT (since Linux 2.3.12)
              If CLONE_PARENT is set, then the parent of  the  new  child  (as
              returned  by getppid(2)) will be the same as that of the calling
              process.

              If CLONE_PARENT is not set, then (as with fork(2))  the  child’s
              parent is the calling process.

              Note  that  it is the parent process, as returned by getppid(2),
              which  is  signaled  when  the  child  terminates,  so  that  if
              CLONE_PARENT  is  set,  then  the parent of the calling process,
              rather than the calling process itself, will be signaled.

       CLONE_FS
              If CLONE_FS is set, the caller and the child processes share the
              same  file  system  information.   This includes the root of the
              file system, the current working directory, and the umask.   Any
              call  to chroot(2), chdir(2), or umask(2) performed by the call-
              ing process or the child process also affects the other process.

              If CLONE_FS is not set, the child process works on a copy of the
              file system information of the calling process at  the  time  of
              the  clone()  call.  Calls to chroot(2), chdir(2), or umask(2)
              performed later by one of the processes do not affect the other
              process.
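
              The effect of CLONE_FS can be observed with the umask.  The
              following is a sketch only; set_umask and
              parent_umask_after_child are hypothetical names.  The child
              sets its umask to 077 and the parent then reads its own umask.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)   /* hypothetical size, ample for this child */

/* Child: change the umask, then terminate with exit code 0. */
static int set_umask(void *arg)
{
    (void)arg;
    umask(077);
    return 0;
}

/* Returns the parent's umask after the child set its own to 077,
   or -1 on error.  Pass CLONE_FS or 0 in extra_flags. */
int parent_umask_after_child(int extra_flags)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    mode_t saved = umask(022);            /* known starting value */
    pid_t pid = clone(set_umask, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, NULL);
    int ok = (pid != -1 && waitpid(pid, NULL, 0) == pid);

    mode_t seen = umask(saved);           /* read mask, restore original */
    free(stack);
    return ok ? (int)seen : -1;
}
```

              With CLONE_FS the parent is expected to see 077; without it,
              the parent keeps 022.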

       CLONE_FILES
              If  CLONE_FILES  is  set, the calling process and the child pro-
              cesses share the same file descriptor table.  Any file  descrip-
              tor  created  by  the calling process or by the child process is
              also valid in the other process.  Similarly, if one of the  pro-
              cesses closes a file descriptor, or changes its associated flags
              (using the fcntl(2) F_SETFD operation),  the  other  process  is
              also affected.

              If  CLONE_FILES is not set, the child process inherits a copy of
              all file descriptors opened in the calling process at  the  time
              of clone().  (The duplicated file descriptors in the child refer
              to the same open file descriptions (see open(2)) as  the  corre-
              sponding  file  descriptors in the calling process.)  Subsequent
              operations that open or close file descriptors, or  change  file
              descriptor flags, performed by either the calling process or the
              child process do not affect the other process.
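
              A sketch of the difference, using hypothetical names close_fd
              and fd_survives_child_close: the child closes the read end of
              a pipe, and the parent then checks whether its own descriptor
              is still valid.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (64 * 1024)   /* hypothetical size, ample for this child */

/* Child: close the descriptor whose number is passed in arg. */
static int close_fd(void *arg)
{
    return close(*(int *)arg) == 0 ? 0 : 1;
}

/* Returns 1 if the parent's descriptor is still open after the child
   closed it, 0 if it is gone, -1 on error.
   Pass CLONE_FILES or 0 in extra_flags. */
int fd_survives_child_close(int extra_flags)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    char *stack = malloc(STACK_SIZE);
    pid_t pid = -1;
    if (stack != NULL)
        pid = clone(close_fd, stack + STACK_SIZE,
                    extra_flags | SIGCHLD, &p[0]);
    int ok = (pid != -1 && waitpid(pid, NULL, 0) == pid);

    /* F_GETFD fails with EBADF if the descriptor is no longer open. */
    int still_open = (fcntl(p[0], F_GETFD) != -1);

    if (still_open)
        close(p[0]);
    close(p[1]);
    free(stack);
    return ok ? still_open : -1;
}
```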

       CLONE_NEWNS (since Linux 2.4.19)
              Start the child in a new namespace.

              Every process lives in a namespace. The namespace of  a  process
              is the data (the set of mounts) describing the file hierarchy as
              seen by that process. After a  fork(2)  or  clone(2)  where  the
              CLONE_NEWNS  flag is not set, the child lives in the same names-
              pace as the parent.  The system  calls  mount(2)  and  umount(2)
              change  the  namespace  of the calling process, and hence affect
              all processes that live in the same namespace, but do not affect
              processes in a different namespace.

              After  a  clone(2) where the CLONE_NEWNS flag is set, the cloned
              child is started in a new namespace, initialized with a copy  of
              the namespace of the parent.

              Only a privileged process (one having the CAP_SYS_ADMIN capabil-
              ity) may specify the CLONE_NEWNS flag.  It is not  permitted  to
              specify  both CLONE_NEWNS and CLONE_FS in the same clone() call.

       CLONE_SIGHAND
              If CLONE_SIGHAND is set, the calling process and the child  pro-
              cesses  share the same table of signal handlers.  If the calling
              process or child process calls sigaction(2) to change the behav-
              ior  associated  with  a  signal, the behavior is changed in the
              other process as well.  However, the calling process  and  child
              processes  still  have distinct signal masks and sets of pending
              signals.  So, one of them may  block  or  unblock  some  signals
              using sigprocmask(2) without affecting the other process.

              If  CLONE_SIGHAND  is not set, the child process inherits a copy
              of the signal handlers  of  the  calling  process  at  the  time
              clone() is called.  Calls to sigaction(2) performed later by one
              of the processes have no effect on the other process.

              Since Linux 2.6.0-test6, flags must  also  include  CLONE_VM  if
              CLONE_SIGHAND is specified.

       CLONE_PTRACE
              If  CLONE_PTRACE  is specified, and the calling process is being
              traced, then trace the child also (see ptrace(2)).

       CLONE_UNTRACED (since Linux 2.5.46)
              If CLONE_UNTRACED is specified, then a  tracing  process  cannot
              force CLONE_PTRACE on this child process.

       CLONE_STOPPED (since Linux 2.6.0-test2)
              If CLONE_STOPPED is set, then the child is initially stopped (as
              though it was sent a SIGSTOP signal), and  must  be  resumed  by
              sending it a SIGCONT signal.

       CLONE_VFORK
              If  CLONE_VFORK  is set, the execution of the calling process is
              suspended until the child releases its virtual memory  resources
              via a call to execve(2) or _exit(2) (as with vfork(2)).

              If  CLONE_VFORK is not set then both the calling process and the
              child are schedulable after the call, and an application  should
              not rely on execution occurring in any particular order.
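
              The suspension of the caller can be illustrated as follows.
              This is a sketch under the assumption that CLONE_VM is combined
              with CLONE_VFORK so that the child's write is visible to the
              caller; child_done, mark_done and
              child_finished_before_clone_returned are hypothetical names.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)   /* hypothetical size, ample for this child */

static volatile int child_done;  /* shared with the child via CLONE_VM */

static int mark_done(void *arg)
{
    (void)arg;
    child_done = 1;
    return 0;                    /* child terminates, releasing the caller */
}

/* Returns 1 if the child had already run to completion when clone()
   returned in the caller, 0 if not, -1 on error. */
int child_finished_before_clone_returned(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    child_done = 0;
    pid_t pid = clone(mark_done, stack + STACK_SIZE,
                      CLONE_VM | CLONE_VFORK | SIGCHLD, NULL);

    int seen = child_done;       /* read before waiting for the child */

    int ok = (pid != -1 && waitpid(pid, NULL, 0) == pid);
    free(stack);
    return ok ? seen : -1;
}
```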

       CLONE_VM
              If  CLONE_VM is set, the calling process and the child processes
              run in the same memory space.  In particular, memory writes per-
              formed  by  the calling process or by the child process are also
              visible in the other process.  Moreover, any memory  mapping  or
              unmapping  performed  with  mmap(2) or munmap(2) by the child or
              calling process also affects the other process.

              If CLONE_VM is not set, the child process  runs  in  a  separate
              copy  of  the memory space of the calling process at the time of
              clone().  Memory writes or file mappings/unmappings performed by
              one of the processes do not affect the other, as with fork(2).
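
              The following sketch contrasts the two cases.  The names
              shared_value, writer and value_seen_by_parent are hypothetical.
              The child stores 42 in a global variable, and the parent reads
              the variable after the child has terminated.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)   /* hypothetical size, ample for this child */

static volatile int shared_value;

/* Child: write to the global variable and terminate. */
static int writer(void *arg)
{
    (void)arg;
    shared_value = 42;
    return 0;
}

/* Returns the value of shared_value as seen by the parent after the
   child has written to it, or -1 on error.
   Pass CLONE_VM or 0 in extra_flags. */
int value_seen_by_parent(int extra_flags)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    shared_value = 0;
    pid_t pid = clone(writer, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, NULL);
    int ok = (pid != -1 && waitpid(pid, NULL, 0) == pid);

    free(stack);                 /* safe: the child has terminated */
    return ok ? shared_value : -1;
}
```

              With CLONE_VM the parent sees the child's write; without it,
              the write lands in the child's private copy and the parent
              still reads 0.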

       CLONE_PID (obsolete)
              If  CLONE_PID is set, the child process is created with the same
              process ID as the calling process. This is good for hacking  the
              system,  but  otherwise  of not much use. Since 2.3.21 this flag
              can be specified only by the system boot process  (PID  0).   It
              disappeared in Linux 2.5.16.

       CLONE_THREAD (since Linux 2.4.0-test8)
              If  CLONE_THREAD  is set, the child is placed in the same thread
              group as the calling process.  To make the remainder of the dis-
              cussion of CLONE_THREAD more readable, the term "thread" is used
              to refer to the processes within a thread group.

              Thread groups were a feature added in Linux 2.4 to  support  the
              POSIX  threads  notion  of  a set of threads that share a single
              PID.  Internally, this shared PID is the so-called thread  group
              identifier  (TGID) for the thread group.  Since Linux 2.4, calls
              to getpid(2) return the TGID of the caller.

              The threads within a group can be distinguished by  their  (sys-
              tem-wide) unique thread IDs (TID).  A new thread’s TID is avail-
              able as the function result returned to the caller  of  clone(),
              and a thread can obtain its own TID using gettid(2).

              When  a call is made to clone() without specifying CLONE_THREAD,
              then the resulting thread is placed in a new thread group  whose
              TGID is the same as the thread’s TID.  This thread is the leader
              of the new thread group.
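
              The last point can be checked with a short sketch.  The names
              is_group_leader and child_is_own_group_leader are hypothetical.
              The raw system calls are used rather than the library wrappers,
              on the assumption that the wrapper for getpid() may cache its
              result.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (64 * 1024)   /* hypothetical size, ample for this child */

/* Child: exit code 1 if its TGID equals its TID, otherwise 0. */
static int is_group_leader(void *arg)
{
    (void)arg;
    return syscall(SYS_getpid) == syscall(SYS_gettid);
}

/* Creates a child without CLONE_THREAD and returns 1 if the child
   reports that it leads its own thread group, 0 if not, -1 on error. */
int child_is_own_group_leader(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    pid_t pid = clone(is_group_leader, stack + STACK_SIZE, SIGCHLD, NULL);

    int status = 0;
    int ok = (pid != -1 && waitpid(pid, &status, 0) == pid);

    free(stack);
    return (ok && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```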

              A new thread created with CLONE_THREAD has the same parent  pro-
              cess as the caller of clone() (i.e., like CLONE_PARENT), so that
              calls to getppid(2) return the same value for all of the threads
              in  a  thread group.  When a CLONE_THREAD thread terminates, the
              thread that created it using clone() is not sent a  SIGCHLD  (or
              other  termination)  signal; nor can the status of such a thread
              be obtained using wait(2).  (The thread is said to be detached.)

              After all of the threads in a thread group terminate, the parent
              process of the thread group is sent a SIGCHLD (or other termina-
              tion) signal.

              If  any  of the threads in a thread group performs an execve(2),
              then all threads other than the thread group leader  are  termi-
              nated,  and  the  new  program  is  executed in the thread group
              leader.

              If one of the threads in a thread group creates  a  child  using
              fork(2),  then  any  thread  in  the  group can wait(2) for that
              child.

              Since Linux 2.5.35, flags must  also  include  CLONE_SIGHAND  if
              CLONE_THREAD is specified.

              Signals  may be sent to a thread group as a whole (i.e., a TGID)
              using kill(2),  or  to  a  specific  thread  (i.e.,  TID)  using
              tgkill(2).

              Signal  dispositions  and actions are process-wide: if an unhan-
              dled signal is delivered to a thread, then it will affect  (ter-
              minate, stop, continue, be ignored in) all members of the thread
              group.

              Each thread has its own signal mask, as set  by  sigprocmask(2),
              but  signals can be pending either: for the whole process (i.e.,
              deliverable to any member of the thread group), when  sent  with
              kill(2);  or for an individual thread, when sent with tgkill(2).
              A call to sigpending(2) returns a signal set that is  the  union
              of  the  signals  pending  for the whole process and the signals
              that are pending for the calling thread.

              If kill(2) is used to send a signal to a thread group,  and  the
              thread  group  has  installed a handler for the signal, then the
              handler will be invoked in  exactly  one,  arbitrarily  selected
              member  of the thread group that has not blocked the signal.  If
              multiple threads in a group are waiting to accept the same  sig-
              nal using sigwaitinfo(2), the kernel will arbitrarily select one
              of these threads to receive a signal sent using kill(2).

       CLONE_SYSVSEM (since Linux 2.5.10)
              If CLONE_SYSVSEM is set, then the child and the calling  process
              share  a  single  list  of  System  V semaphore undo values (see
              semop(2)).  If this flag is not set, then the child has a  sepa-
              rate undo list, which is initially empty.

       CLONE_SETTLS (since Linux 2.5.32)
              The  newtls  parameter  is  the  new  TLS (Thread Local Storage)
              descriptor.  (See set_thread_area(2).)

       CLONE_PARENT_SETTID (since Linux 2.5.49)
              Store child thread ID at location parent_tidptr  in  parent  and
              child   memory.   (In  Linux  2.5.32-2.5.48  there  was  a  flag
              CLONE_SETTID that did this.)

       CLONE_CHILD_SETTID (since Linux 2.5.49)
              Store child thread ID at location child_tidptr in child  memory.

       CLONE_CHILD_CLEARTID (since Linux 2.5.49)
              Erase  child  thread ID at location child_tidptr in child memory
              when the child exits, and do a  wakeup  on  the  futex  at  that
              address.    The   address   involved   may  be  changed  by  the
              set_tid_address(2)  system  call.  This  is  used  by  threading
              libraries.
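
              As an illustration of CLONE_PARENT_SETTID, the sketch below
              assumes the glibc wrapper, which accepts parent_tidptr, newtls
              and child_tidptr as additional trailing arguments (these do not
              appear in the four-argument prototype shown above).  The names
              idle and parent_tid_matches are hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)   /* hypothetical size, ample for this child */

static int idle(void *arg)
{
    (void)arg;
    return 0;
}

/* Returns 1 if the thread ID stored at parent_tidptr equals the value
   returned by clone(), 0 if not, -1 on error. */
int parent_tid_matches(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    pid_t ptid = -1;             /* filled in by the kernel */
    pid_t pid = clone(idle, stack + STACK_SIZE,
                      CLONE_PARENT_SETTID | SIGCHLD, NULL,
                      &ptid,     /* parent_tidptr */
                      NULL,      /* newtls: unused without CLONE_SETTLS */
                      NULL);     /* child_tidptr: unused here */

    int match = (pid != -1 && ptid == pid);
    int ok = (pid != -1 && waitpid(pid, NULL, 0) == pid);

    free(stack);
    return ok ? match : -1;
}
```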

   sys_clone
       The  sys_clone  system call corresponds more closely to fork(2) in that
       execution in the child continues from the point  of  the  call.   Thus,
       sys_clone only requires the flags and child_stack arguments, which have
       the same meaning as for clone().  (Note that the order of  these  argu-
       ments differs from clone().)

       Another  difference  for sys_clone is that the child_stack argument may
       be zero, in which case copy-on-write semantics ensure  that  the  child
       gets  separate  copies  of stack pages when either process modifies the
       stack.  In this case, for correct operation, the CLONE_VM option should
       not be specified.
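
       The fork-like behavior of sys_clone can be sketched with syscall(2).
       This example assumes an architecture such as x86-64, where flags is
       the first argument and child_stack the second; the remaining arguments
       are passed as zero.  raw_clone_exit_status is a hypothetical name.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Invokes sys_clone directly with a zero child_stack and returns the
   child's exit code, or -1 on error. */
int raw_clone_exit_status(int code)
{
    /* No CLONE_VM: the child gets copy-on-write copies of the stack
       pages, so a zero child_stack is acceptable. */
    long pid = syscall(SYS_clone, (unsigned long)SIGCHLD, 0UL,
                       NULL, NULL, 0UL);
    if (pid == -1)
        return -1;

    if (pid == 0)                /* child: continues from the call */
        _exit(code);

    int status = 0;
    if (waitpid((pid_t)pid, &status, 0) != (pid_t)pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

       As with fork(2), the call returns twice: zero in the child and the
       child's thread ID in the caller.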

       Since  Linux  2.5.49  the system call has five parameters.  The two new
       parameters are parent_tidptr which points to the  location  (in  parent
       and  child  memory)  where  the child thread ID will be written in case
       CLONE_PARENT_SETTID was specified, and child_tidptr which points to the
       location (in child memory) where the child thread ID will be written in
       case CLONE_CHILD_SETTID was specified.

RETURN VALUE
       On success, the thread ID of the  child  process  is  returned  in  the
       caller’s thread of execution.  On failure, a -1 will be returned in the
       caller’s context, no child process will be created, and errno  will  be
       set appropriately.

ERRORS
       EAGAIN Too many processes are already running.

       EINVAL CLONE_SIGHAND  was specified, but CLONE_VM was not. (Since Linux
              2.6.0-test6.)

       EINVAL CLONE_THREAD was specified, but CLONE_SIGHAND  was  not.  (Since
              Linux 2.5.35.)

       EINVAL Both CLONE_FS and CLONE_NEWNS were specified in flags.

       EINVAL Returned   by  clone()  when  a  zero  value  is  specified  for
              child_stack.

       ENOMEM Cannot allocate sufficient memory to allocate a  task  structure
              for  the  child,  or to copy those parts of the caller’s context
              that need to be copied.

       EPERM  CLONE_NEWNS was specified by a non-root process (process without
              CAP_SYS_ADMIN).

       EPERM  CLONE_PID was specified by a process other than process 0.
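
       Some of the EINVAL cases above can be reproduced with a short sketch.
       The names noop and clone_failure_errno are hypothetical; the function
       reports the errno value left by a clone() call that is expected to
       fail.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)   /* hypothetical size, ample for this child */

static int noop(void *arg)
{
    (void)arg;
    return 0;
}

/* Calls clone() with the given flags, passing a zero child_stack when
   with_stack is 0.  Returns errno if the call failed, or 0 if it
   succeeded (in which case the child is reaped). */
int clone_failure_errno(int flags, int with_stack)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return ENOMEM;

    errno = 0;
    pid_t pid = clone(noop, with_stack ? stack + STACK_SIZE : NULL,
                      flags | SIGCHLD, NULL);
    int err = (pid == -1) ? errno : 0;

    if (pid != -1)
        waitpid(pid, NULL, 0);   /* unexpected success: reap the child */
    free(stack);
    return err;
}
```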

AVAILABILITY
       There  is  no  entry  for clone() in libc5.  glibc2 provides clone() as
       described in this manual page.

NOTES
       In the kernel 2.4.x series, CLONE_THREAD generally does  not  make  the
       parent of the new thread the same as the parent of the calling process.
       However, for kernel versions 2.4.7  to  2.4.18  the  CLONE_THREAD  flag
       implied the CLONE_PARENT flag (as in kernel 2.6).

       On  x86,  clone()  should  not be called through vsyscall, but directly
       through int $0x80.

CONFORMING TO
       The clone() and sys_clone calls are Linux-specific and  should  not  be
       used in programs intended to be portable.

BUGS
       Versions  of  the GNU C library that include the NPTL threading library
       contain a wrapper function for getpid() that performs caching of  PIDs.
       In programs linked against such libraries, calls to getpid() may return
       the  same  value,  even  when  the  threads  were  not  created   using
       CLONE_THREAD  (and  thus are not in the same thread group).  To get the
       truth, it may be necessary to use code such as the following:

           #include <unistd.h>
           #include <sys/syscall.h>

           pid_t mypid;

           mypid = syscall(SYS_getpid);

SEE ALSO
       fork(2),   futex(2),    getpid(2),    gettid(2),
       tkill(2),  unshare(2),  wait(2),  capabilities(7),
       pthreads(7)
