3.1 The call ladder
This diagram is the exam answer for "explain the socket functions used by a TCP client and server".
3.2 The functions, one by one
int socket(int family, int type, int protocol) — create an endpoint. Family: AF_INET (IPv4), AF_INET6, AF_LOCAL/AF_UNIX, AF_ROUTE. Type: SOCK_STREAM (TCP), SOCK_DGRAM (UDP), SOCK_RAW (Unit 4). Protocol: usually 0. Returns a descriptor or −1.
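A minimal sketch of the call in practice. The helper name `tcp_socket` is an assumption made for illustration; it is not part of the sockets API.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: create an IPv4 TCP endpoint.
 * Returns the new descriptor, or -1 after printing the reason. */
int tcp_socket(void)
{
    /* protocol 0 selects the default for the (family, type) pair: TCP here */
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        perror("socket");
    return fd;
}
```

Swapping SOCK_STREAM for SOCK_DGRAM in the same call would give a UDP endpoint instead.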
int connect(int fd, const struct sockaddr *addr, socklen_t len): called by the client to start the three-way handshake; it returns when the handshake completes or fails. The three failure stories (memorise them, they reappear in Unit 2):
| Outcome | Cause | errno |
|---|---|---|
| no SYN-ACK after retries (~75 s) | host down / unreachable network | ETIMEDOUT |
| RST received | host up, no process listening on the port | ECONNREFUSED |
| ICMP "destination unreachable" | routing failure | EHOSTUNREACH / ENETUNREACH |
Note: the client does not call bind — the kernel picks an ephemeral port automatically.
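The client side can be sketched as follows. The helper names `tcp_connect` and `local_port` are assumptions for illustration. The first reports connect() failures through errno exactly as the table lists them; the second reads back the ephemeral port the kernel picked, since the client never called bind.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: connect to ip:port over IPv4 TCP.
 * Returns a connected descriptor, or -1 with errno as connect() left it. */
int tcp_connect(const char *ip, unsigned short port)
{
    struct sockaddr_in srv;
    int fd, saved;

    memset(&srv, 0, sizeof srv);
    srv.sin_family = AF_INET;
    srv.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &srv.sin_addr) != 1) {
        errno = EINVAL;
        return -1;
    }
    if ((fd = socket(AF_INET, SOCK_STREAM, 0)) == -1)
        return -1;
    /* No bind(): the kernel assigns an ephemeral local port during connect(). */
    if (connect(fd, (struct sockaddr *)&srv, sizeof srv) == -1) {
        saved = errno;      /* ETIMEDOUT, ECONNREFUSED, EHOSTUNREACH, ... */
        close(fd);          /* close() may overwrite errno, so save it first */
        errno = saved;
        return -1;
    }
    return fd;
}

/* Local port the kernel chose for fd (host byte order), or 0 on error. */
unsigned short local_port(int fd)
{
    struct sockaddr_in me;
    socklen_t len = sizeof me;

    if (getsockname(fd, (struct sockaddr *)&me, &len) == -1)
        return 0;
    return ntohs(me.sin_port);
}
```

A caller would typically test errno after a -1 return: ECONNREFUSED means the host answered with RST, ETIMEDOUT means nothing answered at all.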
int bind(int fd, const struct sockaddr *addr, socklen_t len): assign the local (IP, port) pair. Servers bind their well-known port; INADDR_ANY means accept on every interface; port 0 lets the kernel choose. Binding a port below 1024 needs privilege. Classic error: EADDRINUSE (often a TIME_WAIT leftover, cured by SO_REUSEADDR; see Unit 2).
int listen(int fd, int backlog): convert the socket to passive (LISTEN state). The kernel keeps two queues: the incomplete queue (SYN received, handshake in progress, state SYN_RCVD) and the completed queue (ESTABLISHED, waiting for accept). In the classic Berkeley model backlog bounds their sum (some modern kernels, Linux among them, apply it to the completed queue only); when the queues are full, new SYNs are ignored and the client retransmits. SYN-flood attacks aim exactly at filling the incomplete queue.
int accept(int fd, struct sockaddr *cliaddr, socklen_t *len): dequeue one completed connection. Returns a brand-new connected descriptor; the listening descriptor stays open for further clients. One listening socket, many connected sockets. The pair cliaddr and len identifies the client; len is a value-result argument (set it to the buffer size before the call, read back the stored size after).
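The value-result handling can be sketched like this, assuming an IPv4 listener. The helper name `accept_client` and the "ip:port" output format are assumptions for illustration.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: accept one connection and write the client's
 * "ip:port" into buf. Returns the connected descriptor, or -1. */
int accept_client(int listenfd, char *buf, size_t buflen)
{
    struct sockaddr_in cli;
    socklen_t len = sizeof cli;     /* value: how much room we offer */
    char ip[INET_ADDRSTRLEN];
    int connfd;

    connfd = accept(listenfd, (struct sockaddr *)&cli, &len);
    if (connfd == -1)
        return -1;
    /* result: len now holds the size of the address the kernel stored */
    if (inet_ntop(AF_INET, &cli.sin_addr, ip, sizeof ip) == NULL)
        ip[0] = '\0';
    snprintf(buf, buflen, "%s:%u", ip, (unsigned)ntohs(cli.sin_port));
    return connfd;                  /* listenfd is untouched and still listening */
}
```

Forgetting to initialise len before the call is a common bug: the kernel then sees a garbage buffer size.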
close(fd) — decrement the descriptor's reference count; when it hits 0, start the four-segment FIN sequence. Reference counting is the reason fork-based servers must be disciplined about closing (below).
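The reference-count rule can be observed without fork. In this sketch dup() stands in for the descriptor duplication that fork performs, and both function names are hypothetical. The peer sees end-of-file (the FIN) only after the last copy is closed.

```c
#include <poll.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if the peer's FIN (end-of-file) is visible on fd within
 * timeout_ms, 0 if nothing arrived, -1 on error.
 * Assumes no application data is pending on fd. */
int peer_closed(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    char c;
    ssize_t r;
    int n = poll(&p, 1, timeout_ms);

    if (n <= 0)
        return n;                        /* 0: timeout, -1: poll error */
    r = recv(fd, &c, 1, MSG_PEEK);       /* 0 bytes readable means EOF */
    if (r == -1)
        return -1;
    return r == 0;
}

/* connfd and peerfd are the two ends of one connection.
 * Returns 1 if close() behaved as the reference-count rule predicts. */
int close_sends_fin_only_at_zero(int connfd, int peerfd)
{
    int early, late;
    int copy = dup(connfd);              /* count 2, as after fork() */

    if (copy == -1)
        return 0;
    close(connfd);                       /* count 1: no FIN yet */
    early = peer_closed(peerfd, 100);
    close(copy);                         /* count 0: FIN is sent */
    late = peer_closed(peerfd, 1000);
    return early == 0 && late == 1;
}
```

The 100 ms and 1000 ms waits are arbitrary values chosen for the sketch.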
3.3 fork and exec — the concurrency primitives
```c
pid_t pid = fork();   /* called ONCE, returns TWICE */
/* in the child:  pid == 0 */
/* in the parent: pid == child's PID (or -1 if fork failed) */
```
The child is a copy of the parent — crucially, it shares all open descriptors (the kernel just bumps reference counts). exec (6 variants: execl, execv, execle, execve, execlp, execvp) replaces the process image with a new program — descriptors stay open across exec, which is how inetd (Unit 3) hands sockets to the programs it launches.
3.4 The concurrent server pattern (the code to memorise)
An iterative server serves one client at a time — fine for daytime, useless for long conversations. The classic concurrent server forks one child per client:
```c
listenfd = socket(AF_INET, SOCK_STREAM, 0);
bind(listenfd, ...);
listen(listenfd, LISTENQ);
for (;;) {
    len = sizeof(cli);                 /* value-result: reset before every accept */
    connfd = accept(listenfd, (struct sockaddr *)&cli, &len);
    if (fork() == 0) {                 /* ---- CHILD ---- */
        close(listenfd);               /* child doesn't accept */
        doit(connfd);                  /* serve THIS client */
        close(connfd);
        exit(0);
    }
    close(connfd);                     /* ---- PARENT: must close! ---- */
}
```
Why the parent's close(connfd) is essential: after fork the connected socket's reference count is 2. The parent closes its copy (count → 1, connection stays alive for the child). If the parent forgot: count never reaches 0, no FIN is ever sent, descriptors leak until the server dies. A two-mark trap question.
The status ladder during a connection — descriptors by process:
| Moment | parent fds | child fds |
|---|---|---|
| after accept | listenfd, connfd | — |
| after fork | listenfd, connfd | listenfd, connfd |
| after both closes | listenfd | connfd |
getsockname / getpeername complete the elementary API: they return the local / remote address bound to a socket — needed e.g. after exec, when the new program inherits a connected descriptor but not the address variables, or to learn which IP the kernel chose on a multihomed host.
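Both calls share one shape, so a single sketch covers them. The helper name `sock_name` and its "ip:port" output are assumptions, and only IPv4 is handled. A program started by exec could call it on its inherited descriptor to learn who the client is.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: write "ip:port" for one end of an IPv4 socket.
 * remote != 0 asks for the peer (getpeername),
 * remote == 0 asks for the local end (getsockname).
 * Returns 0 on success, -1 on error. */
int sock_name(int fd, int remote, char *buf, size_t buflen)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof sa;          /* value-result, as with accept */
    char ip[INET_ADDRSTRLEN];
    int rc;

    rc = remote ? getpeername(fd, (struct sockaddr *)&sa, &len)
                : getsockname(fd, (struct sockaddr *)&sa, &len);
    if (rc == -1 || sa.sin_family != AF_INET)
        return -1;
    if (inet_ntop(AF_INET, &sa.sin_addr, ip, sizeof ip) == NULL)
        return -1;
    snprintf(buf, buflen, "%s:%u", ip, (unsigned)ntohs(sa.sin_port));
    return 0;
}
```

On one connection, the server's peer name equals the client's local name, and vice versa.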
3.5 Wrapper functions and error style
Stevens' code wraps every call in a capitalised version (Socket, Bind, Accept...) that checks the return and dies with a message on error. Production code distinguishes fatal setup errors from per-connection errors (ECONNABORTED, EINTR around accept → just continue the loop). Either way: check every return value — networking is the land where everything fails eventually.
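The two styles side by side, as a sketch. `Socket` follows the capitalised-wrapper convention described above (the exact message text is an assumption), while `accept_retry` is a hypothetical helper showing the production habit of retrying per-connection errors.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>

/* Wrapper style: same arguments as socket(), but never returns on failure. */
int Socket(int family, int type, int protocol)
{
    int fd = socket(family, type, protocol);

    if (fd == -1) {
        perror("socket error");
        exit(1);                        /* fatal setup error: give up */
    }
    return fd;
}

/* Production style: retry errors that concern a single connection,
 * report everything else to the caller with errno intact. */
int accept_retry(int listenfd)
{
    for (;;) {
        int connfd = accept(listenfd, NULL, NULL);

        if (connfd >= 0)
            return connfd;
        if (errno == EINTR || errno == ECONNABORTED)
            continue;                   /* just continue the loop */
        return -1;
    }
}
```

Dying on error suits setup code and teaching examples; a long-running server usually prefers the second form inside its accept loop.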
3.6 Inside connect(): three failures on the wire, blow by blow
What §3.2's errno table summarises deserves a wire-level narration — this exact walkthrough is a classic 8–10 mark question:
**Case A: connecting to a dead or unplugged host.** The client kernel sends a SYN. Nothing answers: no host, no RST, nothing. TCP retransmits the SYN with exponential backoff (classically about 6 s after the first SYN, then about 24 s after that); after roughly 75 seconds in total it gives up and connect() returns ETIMEDOUT. Note what the application experienced: a connect() call that **blocked for over a minute**, which is the motivation for nonblocking connects with select timeouts in real clients.
**Case B: the host is up, but no process listens on the port.** The SYN arrives; the kernel finds no socket in LISTEN for that port and answers immediately with **RST**. connect() fails at once with ECONNREFUSED. Fast failure is the giveaway: refused ≈ instant, timed-out ≈ 75 s. (This is a hard error; contrast UDP, where the equivalent is an ICMP port-unreachable that an unconnected socket never even hears about; see Unit 2.)
**Case C: a router gives up.** There is no route to the destination, so some router sends **ICMP destination unreachable**. The kernel treats this as a soft error: it notes the ICMP message but keeps retransmitting the SYN, because routing might heal. Only when the retries run out does connect() fail with EHOSTUNREACH or ENETUNREACH.
| Failure | What comes back | Speed | errno |
|---|---|---|---|
| Host down | nothing (SYN retransmits die) | ~75 s | ETIMEDOUT |
| Port closed | RST | instant | ECONNREFUSED |
| No route | ICMP unreachable | after retries | EHOSTUNREACH / ENETUNREACH |
One more connect() property worth a line: for a TCP socket, a failed connect() leaves the socket unusable. Do not call connect() again on the same descriptor; close it and create a new socket before retrying.
3.7 The listen backlog, drawn
Facts examiners reward:
- The handshake completes before accept is ever called: a connection can be fully ESTABLISHED while the server is busy elsewhere, and data sent by an eager client is simply queued.
- When the queues are full the kernel ignores the SYN rather than sending RST, so the client's retransmitted SYN can succeed once room appears.
- A SYN flood fills the incomplete queue with spoofed-source SYNs whose final ACK never comes; defences include SYN cookies.
- A backlog of 5 was the classic historical value; busy servers use hundreds.
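The claim that the handshake completes before accept is called can be checked in a single process, where the "server" is provably busy elsewhere while the client connects and writes. This is a sketch over the loopback interface; the function name `early_data_demo` and the payload are assumptions.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Connects and writes "early" BEFORE accept() is called, then accepts
 * and reads. Returns the number of bytes the server side read into buf,
 * or -1 if any step failed. */
ssize_t early_data_demo(char *buf, size_t buflen)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof sa;
    ssize_t n = -1;
    int lfd = -1, cfd = -1, connfd = -1;

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* port 0: kernel chooses */

    if ((lfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) goto done;
    if (bind(lfd, (struct sockaddr *)&sa, sizeof sa) == -1) goto done;
    if (getsockname(lfd, (struct sockaddr *)&sa, &len) == -1) goto done;
    if (listen(lfd, 5) == -1) goto done;

    /* The kernel completes the handshake; nobody has called accept yet. */
    if ((cfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) goto done;
    if (connect(cfd, (struct sockaddr *)&sa, sizeof sa) == -1) goto done;
    if (write(cfd, "early", 5) != 5) goto done;    /* queued by the kernel */

    /* Only now does the server dequeue the completed connection. */
    if ((connfd = accept(lfd, NULL, NULL)) == -1) goto done;
    n = read(connfd, buf, buflen);
done:
    if (connfd != -1) close(connfd);
    if (cfd != -1) close(cfd);
    if (lfd != -1) close(lfd);
    return n;
}
```

That connect() returns at all before accept() runs is the point: the connection was sitting in the completed queue the whole time.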
3.8 Iterative vs concurrent vs preforked — the server taxonomy
| Architecture | How it works | Cost per client | Right for |
|---|---|---|---|
| Iterative | one loop: accept → serve → close → repeat | zero extra | tiny exchanges (daytime): total service time ≈ one RTT |
| Concurrent (fork per client) | accept → fork → child serves | one fork (~ms) | the classic; isolates clients; this lesson's code |
| Preforked | N children forked at startup, all blocking in accept on the shared listenfd | fork cost paid in advance | busy servers (classic Apache); kernel distributes connections |
| Threaded | thread per client, or thread pool | cheaper than fork; shared memory (locking!) | when state must be shared |
| select/event loop | one process watches all sockets | no process per client at all | huge fan-in (Unit 2's multiplexing lesson) |
The preforked detail worth quoting: all N children call accept on the same listening descriptor — this is legal precisely because of descriptor sharing across fork (§3.3); the kernel wakes one child per connection (historically all of them — the "thundering herd" problem, since fixed).
Exam pointers
- "Describe the socket system calls for TCP client and server" — the §3.1 ladder + one paragraph per function from §3.2; include each prototype.
- "What is the role of backlog in listen()?" — two queues, their states, sum bounded by backlog, SYN-flood connection. The two-queue distinction is the marks discriminator.
- "Why must the parent close connfd in a concurrent server?" — reference counting; the table in §3.4 is the cleanest answer format.
- Trap question: "Does accept create a new port for the new connection?" No — same local port; the kernel demultiplexes established connections by the full 4-tuple. Saying "new port" is the classic wrong answer.
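The trap question can be settled empirically. In this sketch (the helper name `local_port` is hypothetical, IPv4 assumed), the port reported for the descriptor returned by accept is the same as the listening socket's port.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Local port of an IPv4 socket in host byte order, or -1 on error. */
int local_port(int fd)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof sa;

    if (getsockname(fd, (struct sockaddr *)&sa, &len) == -1)
        return -1;
    return (int)ntohs(sa.sin_port);
}
```

Calling local_port(listenfd) and local_port(connfd) yields one number; only the remote half of the 4-tuple differs from client to client.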
Check yourself
- connect() to host X fails in under a millisecond; to host Y it fails after a minute. What is each host's situation?
- Five clients connect to a server that hasn't called accept yet. Where are those connections? Can the clients already send data?
- In the fork server, what exactly does the child's close(listenfd) accomplish — and what breaks if it's omitted? (Subtler than the parent's close!)
- Why can ten thousand established connections all share server port 80?
- After fork, which process should call getpeername if exec is about to replace the child's image — and why might it need to?