2.1 The address structures — the API's luggage
Every socket function passes addresses via structures. IPv4:
struct in_addr {
in_addr_t s_addr; /* 32-bit IPv4 address, network byte order */
};
struct sockaddr_in {
sa_family_t sin_family; /* AF_INET */
in_port_t sin_port; /* 16-bit port, NETWORK byte order */
struct in_addr sin_addr; /* 32-bit address */
char sin_zero[8]; /* padding, always zero */
};
IPv6 uses struct sockaddr_in6 (sin6_family = AF_INET6, 128-bit sin6_addr, plus flow & scope fields). The generic structure exists only so one function signature fits all families:
struct sockaddr { /* generic — what the API functions take */
sa_family_t sa_family;
char sa_data[14];
};
/* every call needs the cast: */
connect(fd, (struct sockaddr *)&servaddr, sizeof(servaddr));
(struct sockaddr_storage is the modern generic version — big and aligned enough for any family.) The constant pairing in every call: a pointer to a sockaddr + its length.
2.2 Value-result arguments — the exam's favourite subtlety
When an address travels process → kernel (bind, connect, sendto), the length is passed by value — the kernel just needs to know how much to copy.
When an address travels kernel → process (accept, recvfrom, getsockname, getpeername), the length is passed by reference — a value-result argument:
struct sockaddr_in cliaddr;
socklen_t len = sizeof(cliaddr); /* VALUE: buffer size we provide */
int connfd = accept(listenfd, (struct sockaddr *)&cliaddr, &len);
/* RESULT: len now holds the actual number of bytes the kernel stored */
Going in, it tells the kernel "this much room"; coming back, it says "this much was actually written". Forgetting to (re)initialise len before each call is a classic bug.
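The same in/out behaviour can be observed without a client, by asking getsockname which port the kernel assigned to a socket. What follows is a minimal sketch, assuming a POSIX system with IPv4 loopback available; the helper name bound_port is hypothetical and not part of any library.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a TCP socket to 127.0.0.1 with port 0 (the kernel
 * picks a free port), then ask the kernel which port it chose.
 * On success returns the port in host byte order and stores the length
 * reported by the kernel in *out_len; returns -1 on any failure. */
int bound_port(socklen_t *out_len) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);                      /* 0 = let the kernel choose */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* process -> kernel: length passed by value */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }

    struct sockaddr_in actual;
    socklen_t len = sizeof(actual);                /* VALUE: room we offer */

    /* kernel -> process: length passed by reference */
    if (getsockname(fd, (struct sockaddr *)&actual, &len) < 0) {
        close(fd);
        return -1;
    }
    close(fd);

    *out_len = len;                                /* RESULT: bytes stored */
    return ntohs(actual.sin_port);
}
```

Note that len is set immediately before the getsockname call; reusing a len left over from an earlier call is exactly the bug described above.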
2.3 Byte ordering — big-endian wire, whatever-endian host
A 16-bit value 0x1234 can be stored big-endian (0x12 first — "network byte order") or little-endian (0x34 first — x86 hosts). The Internet protocols mandate big-endian for every multi-byte header field. Hence the conversion quartet:
| Function | Meaning | Used for |
|---|---|---|
| htons() | host to network short (16-bit) | ports |
| htonl() | host to network long (32-bit) | addresses |
| ntohs() | network to host short | reading ports |
| ntohl() | network to host long | reading addresses |
servaddr.sin_port = htons(9877); /* ALWAYS */
servaddr.sin_addr.s_addr = htonl(INADDR_ANY); /* 0.0.0.0 = any interface */
Omitting htons on x86 makes the server listen on the byte-swapped port — it "works on the test that never checks the port number" and fails mysteriously in the real world. On big-endian machines these functions are no-ops; we call them always, for portability.
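The byte layouts can be checked directly by copying a converted value into a byte array. This is a small sketch; the function names host_is_little_endian and network_bytes16 are invented for illustration.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if this host stores the low-order byte at the lowest address. */
int host_is_little_endian(void) {
    uint16_t probe = 0x1234;
    unsigned char first;
    memcpy(&first, &probe, 1);       /* inspect the byte at the lowest address */
    return first == 0x34;
}

/* Stores the in-memory bytes of htons(v) into out[0] and out[1]. */
void network_bytes16(uint16_t v, unsigned char out[2]) {
    uint16_t n = htons(v);           /* byte-swap on little-endian, no-op otherwise */
    memcpy(out, &n, sizeof(n));
}
```

Whatever host_is_little_endian reports, network_bytes16(0x1234, out) should leave 0x12 in out[0] and 0x34 in out[1], which is the whole point of the conversion functions.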
2.4 Byte manipulation & address conversion functions
Byte (not string!) operations for binary data that may contain zeros: memset, memcpy, memcmp (modern) — bzero, bcopy, bcmp (Berkeley heritage, still in older code).
Presentation ↔ numeric address conversion:
| Function | Direction | Families |
|---|---|---|
| inet_aton, inet_addr (deprecated), inet_ntoa | dotted-decimal ↔ 32-bit | IPv4 only |
| inet_pton(af, str, buf) | presentation → numeric | IPv4 & IPv6 |
| inet_ntop(af, buf, str, size) | numeric → presentation | IPv4 & IPv6 |
inet_pton(AF_INET, "192.168.1.10", &servaddr.sin_addr); /* string → binary */
char ip[INET_ADDRSTRLEN];
inet_ntop(AF_INET, &cliaddr.sin_addr, ip, sizeof(ip)); /* binary → string */
Use the modern pair in all new code — protocol-independent and thread-safe (inet_ntoa returns a static buffer — a hidden race).
2.5 readn / writen — handling short counts on streams
A stream-socket read may return fewer bytes than asked (data still in flight); write to a non-blocking socket may likewise be short. The standard wrappers loop until done — write them once, use them everywhere (and quote them in answers):
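#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly n bytes; returns n on success, -1 on error. */
ssize_t writen(int fd, const void *buf, size_t n) {
    size_t left = n;
    const char *p = buf;
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w <= 0) {
            if (w < 0 && errno == EINTR) continue; /* interrupted: retry */
            return -1;
        }
        left -= (size_t)w;   /* short count: advance and loop again */
        p += w;
    }
    return (ssize_t)n;
}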
readn mirrors it; readline reads byte-wise (or buffered) until '\n' — needed because the stream has no message boundaries (lesson 1.3's warning made concrete).
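One way readn could look is sketched below. The single asymmetry assumed here is that read() returning 0 means end-of-file, which is reported as a short count rather than as an error; that convention is a choice made for this sketch.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly n bytes unless EOF or an error intervenes.
 * Returns the number of bytes read (less than n only at EOF), or -1 on error. */
ssize_t readn(int fd, void *buf, size_t n) {
    size_t left = n;
    char *p = buf;
    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r < 0) {
            if (errno == EINTR) continue;   /* interrupted: retry */
            return -1;
        }
        if (r == 0) break;                  /* EOF: peer closed early */
        left -= (size_t)r;                  /* short count: keep reading */
        p += r;
    }
    return (ssize_t)(n - left);
}
```

A caller that needs a fixed-size record compares the return value with n: equal means a full record, smaller means the peer closed the connection mid-record.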
2.6 IPv4 vs IPv6 address structures, side by side
"Compare sockaddr_in and sockaddr_in6" is a standing exam item. The IPv6 structure in full:
struct in6_addr {
uint8_t s6_addr[16]; /* 128-bit IPv6 address, network byte order */
};
struct sockaddr_in6 {
sa_family_t sin6_family; /* AF_INET6 */
in_port_t sin6_port; /* port, network byte order */
uint32_t sin6_flowinfo; /* flow label (low 20 bits) */
struct in6_addr sin6_addr; /* 128-bit address */
uint32_t sin6_scope_id; /* interface index for link-local addrs */
};
| Feature | sockaddr_in (IPv4) | sockaddr_in6 (IPv6) |
|---|---|---|
| family constant | AF_INET | AF_INET6 |
| address size | 32 bits (in_addr) | 128 bits (in6_addr) |
| total struct size | 16 bytes | 28 bytes — bigger than generic sockaddr! |
| wildcard | INADDR_ANY (a value: 0.0.0.0) | in6addr_any (a variable — can't be a constant, it's a struct) |
| loopback | INADDR_LOOPBACK (127.0.0.1) | in6addr_loopback (::1) |
| extra fields | sin_zero padding | flowinfo + scope_id (which interface, for fe80:: link-local) |
The size row carries the punchline: a sockaddr_in6 does not fit inside the 16-byte generic struct sockaddr — which is why struct sockaddr_storage exists (big enough and aligned for every family). Modern rule: declare buffers as sockaddr_storage, cast pointers to sockaddr.
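The "declare as sockaddr_storage, cast to the specific type" rule can be sketched as follows. The helper names make_wildcard6 and in6_overflows_generic are hypothetical; the size comparison is computed by the compiler rather than hard-coded, since exact struct sizes are platform details.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill a sockaddr_storage with the IPv6 wildcard
 * address and the given port; returns the length to pass to bind(). */
socklen_t make_wildcard6(struct sockaddr_storage *ss, unsigned short port) {
    memset(ss, 0, sizeof(*ss));                 /* also zeroes flowinfo and scope_id */
    struct sockaddr_in6 *a6 = (struct sockaddr_in6 *)ss;
    a6->sin6_family = AF_INET6;
    a6->sin6_port = htons(port);
    a6->sin6_addr = in6addr_any;                /* struct assignment from a variable */
    return (socklen_t)sizeof(*a6);
}

/* Returns 1 if a sockaddr_in6 would overflow a plain struct sockaddr buffer. */
int in6_overflows_generic(void) {
    return sizeof(struct sockaddr_in6) > sizeof(struct sockaddr);
}
```

A caller would then pass (struct sockaddr *)&ss together with the returned length to bind, keeping the pointer-plus-length pairing from §2.1.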
2.7 Why the cast circus? (the design question)
Why does every call take struct sockaddr plus a length, forcing casts everywhere? Because sockets predate void * (ANSI C, 1989): the API needed some typed pointer that could stand for "any address family", so a generic struct was invented and casting became the convention. The length argument exists because different families have different sizes; the kernel uses (family, length) to interpret the bytes. If sockets were designed today, the signature would be connect(fd, const void *addr, socklen_t len). Examiners enjoy this one-paragraph history question because it tests whether you understand the mechanism (family field + explicit length = manual polymorphism in C).
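The (family, length) dispatch can be imitated in user space. The sketch below is only an illustration of the idea, not how any particular kernel is written; sockaddr_port is a hypothetical helper name.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: extract the port (host byte order) from any
 * supported address by switching on the family field.
 * Returns -1 if the family is unknown or len is too short for it. */
int sockaddr_port(const struct sockaddr *sa, socklen_t len) {
    /* The family field itself must lie inside the supplied bytes. */
    if (len < (socklen_t)offsetof(struct sockaddr, sa_data)) return -1;

    switch (sa->sa_family) {
    case AF_INET:
        if (len < (socklen_t)sizeof(struct sockaddr_in)) return -1;
        return ntohs(((const struct sockaddr_in *)sa)->sin_port);
    case AF_INET6:
        if (len < (socklen_t)sizeof(struct sockaddr_in6)) return -1;
        return ntohs(((const struct sockaddr_in6 *)sa)->sin6_port);
    default:
        return -1;                   /* family this sketch does not handle */
    }
}
```

The family field selects the concrete type and the length guards the cast, so a truncated IPv6 address (for example one squeezed into a plain struct sockaddr) is rejected instead of being read past its end.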
2.8 Worked example: filling a server address, every byte accounted for
struct sockaddr_in servaddr;
memset(&servaddr, 0, sizeof(servaddr)); /* 1: zero everything (sin_zero!) */
servaddr.sin_family = AF_INET; /* 2: host byte order — never converted */
servaddr.sin_port = htons(9877); /* 3: 16-bit, NETWORK order */
servaddr.sin_addr.s_addr = htonl(INADDR_ANY); /* 4: 32-bit, NETWORK order */
Line-by-line viva traps: (1) skipping memset leaves sin_zero dirty — some stacks reject the bind; (2) sin_family is never byte-swapped — it never travels on the wire, it only tells this kernel how to read the struct; (3) forgetting htons is invisible on big-endian machines and a port-number disaster on x86; (4) INADDR_ANY is defined as 0, so htonl is technically a no-op here — we still write it, because the rule ("every multi-byte value crossing into a sockaddr is network order") matters more than the special case.
On the wire, port 9877 (0x2695) is stored as the bytes 0x26 0x95 in that order — big end first. Sketching those two bytes for both endiannesses is the standard way to prove you understand byte order rather than recite it.
Exam pointers
- "Explain value-result arguments with an example" — the accept() fragment in §2.2 is the expected answer; name the four kernel→process functions that use them.
- "Why are htons/htonl needed?" — define both byte orders, state that the wire is big-endian, show the 0x1234 byte layouts, end with the portability argument (no-ops on big-endian, mandatory style anyway).
- "Write readn/writen and explain why they are necessary" — quote §2.5; the why (short counts on streams) is worth as much as the code.
Check yourself
- Which functions pass a socket address into the kernel, and which receive one out? Which group needs value-result lengths?
- Why can in6addr_any not be a #define'd integer constant like INADDR_ANY?
- A colleague stores a peer address in struct sockaddr and IPv6 clients fail mysteriously. Diagnose.
- Why is inet_ntoa unsafe in threaded programs, and which call replaces it?
- write() returns 1300 when you asked for 4096. Is errno set? What must your code do next?