1.1 The DNS in five lines
Programs use names (www.example.com); packets need addresses. The DNS is the distributed, hierarchical database that maps between them. Resource records you must know:
| RR | Maps |
|---|---|
| A | name → 32-bit IPv4 address |
| AAAA ("quad A") | name → 128-bit IPv6 address |
| PTR | address → name (reverse: in-addr.arpa / ip6.arpa) |
| MX | mail exchanger + preference |
| CNAME | alias → canonical name |
| NS / SOA | delegation / zone authority |
Applications don't speak DNS directly: they call the resolver library, which sends UDP queries (falling back to TCP) to the name servers listed in /etc/resolv.conf. /etc/nsswitch.conf decides whether /etc/hosts or DNS is consulted first.
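The PTR row is easier to remember once the reverse name has been built by hand. The following is a minimal sketch, assuming IPv4 only; ptr_name_v4 is a hypothetical helper written for these notes, not a library function.

```c
#include <arpa/inet.h>
#include <stdio.h>

/* Build the reverse-lookup (PTR) owner name for a dotted-quad IPv4 string.
 * Returns 0 on success, -1 if the input is not a valid IPv4 address or
 * the output buffer is too small. Hypothetical helper for illustration. */
int ptr_name_v4(const char *dotted, char *out, size_t outlen)
{
    struct in_addr addr;
    const unsigned char *b;
    int n;

    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return -1;

    /* s_addr is stored in network byte order, so b[0] is the first octet. */
    b = (const unsigned char *)&addr.s_addr;

    /* Octets are written in reverse order, then the in-addr.arpa suffix. */
    n = snprintf(out, outlen, "%u.%u.%u.%u.in-addr.arpa",
                 (unsigned)b[3], (unsigned)b[2],
                 (unsigned)b[1], (unsigned)b[0]);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Called with "192.168.1.5", the helper writes 5.1.168.192.in-addr.arpa, which is the name a PTR query for that address asks about.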
1.2 gethostbyname — the classic lookup
struct hostent {
char *h_name; /* canonical name */
char **h_aliases; /* alias list */
int h_addrtype; /* AF_INET */
int h_length; /* 4 */
char **h_addr_list; /* LIST of addresses (multi-homed!) */
};
struct hostent *gethostbyname(const char *hostname);
- Returns NULL on error with the code in h_errno (not errno!): HOST_NOT_FOUND, TRY_AGAIN, NO_RECOVERY, NO_DATA (name exists, e.g. MX-only, but no A record).
- Walk h_addr_list and try each address until connect succeeds: multihomed hosts are the rule, not the exception.
- Limitations: IPv4-only mindset; returns a pointer to static storage (not thread-safe / not reentrant, the same caveat as inet_ntoa, and the deep reason the modern API below exists).
gethostbyaddr is the reverse (address → hostent, via PTR records).
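The double indirection in h_addr_list and the h_errno codes can be sketched as follows. This is an illustrative sketch only: print_hostent_addrs, h_errno_name and lookup_legacy are hypothetical helper names, and the code assumes an AF_INET hostent.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Print every IPv4 address in a hostent and return how many were printed. */
int print_hostent_addrs(const struct hostent *hp)
{
    char text[INET_ADDRSTRLEN];
    char **pp;
    int count = 0;

    if (hp == NULL || hp->h_addrtype != AF_INET)
        return 0;

    /* h_addr_list is a NULL-terminated array of pointers to raw addresses,
     * each h_length bytes long, in network byte order. */
    for (pp = hp->h_addr_list; *pp != NULL; pp++) {
        struct in_addr addr;
        memcpy(&addr, *pp, sizeof(addr));   /* no alignment assumptions */
        if (inet_ntop(AF_INET, &addr, text, sizeof(text)) != NULL) {
            printf("%s -> %s\n", hp->h_name, text);
            count++;
        }
    }
    return count;
}

/* Map the four h_errno values to their names. */
const char *h_errno_name(int code)
{
    switch (code) {
    case HOST_NOT_FOUND: return "HOST_NOT_FOUND";
    case TRY_AGAIN:      return "TRY_AGAIN";
    case NO_RECOVERY:    return "NO_RECOVERY";
    case NO_DATA:        return "NO_DATA";
    default:             return "unknown";
    }
}

/* Legacy lookup: returns the number of addresses, or -1 on failure. */
int lookup_legacy(const char *name)
{
    struct hostent *hp = gethostbyname(name);   /* points to static storage */

    if (hp == NULL) {
        /* The error code is in h_errno; errno is not meaningful here. */
        fprintf(stderr, "%s: %s\n", name, h_errno_name(h_errno));
        return -1;
    }
    return print_hostent_addrs(hp);
}
```

A real client would attempt connect on each address inside the loop instead of printing it, and would copy anything it needs out of the hostent before calling gethostbyname again.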
1.3 Service lookups, and uname/gethostname
getservbyname("ftp", "tcp") / getservbyport(htons(21), "tcp") consult /etc/services — code that says "ftp" instead of 21 survives port reassignments. Host identity: gethostname() returns the local hostname; uname() fills a struct utsname (sysname, nodename, release, version, machine) — the classic route from "who am I" to gethostbyname(nodename) for "what's my address".
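Both lookups in this section fit in a few lines. The sketch below assumes a typical Unix system; service_port and local_nodename are hypothetical helpers, and accepting a numeric string in service_port is a convenience added here, not something getservbyname does.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/utsname.h>

/* Resolve a service given as a name ("ftp") or a decimal string ("21").
 * Returns the port in host byte order, or -1 if it cannot be resolved. */
int service_port(const char *service, const char *proto)
{
    char *end;
    long n = strtol(service, &end, 10);
    struct servent *sp;

    if (*service != '\0' && *end == '\0')          /* purely numeric input */
        return (n >= 0 && n <= 65535) ? (int)n : -1;

    sp = getservbyname(service, proto);            /* consults /etc/services */
    if (sp == NULL)
        return -1;
    return ntohs((unsigned short)sp->s_port);      /* s_port is network order */
}

/* Copy the local node name reported by uname() into buf; 0 on success. */
int local_nodename(char *buf, size_t len)
{
    struct utsname u;

    if (uname(&u) < 0)
        return -1;
    snprintf(buf, len, "%s", u.nodename);
    return 0;
}
```

On a system whose /etc/services has the usual entry, service_port("ftp", "tcp") gives 21; the name returned by local_nodename is what the classic route would then hand to gethostbyname.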
1.4 The modern, protocol-independent pair: getaddrinfo & getnameinfo
The replacement API treats IPv4/IPv6 uniformly and is reentrant:
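/* needs <netdb.h>, <sys/socket.h>, <unistd.h>, <string.h>, <stdio.h>, <stdlib.h> */
struct addrinfo hints, *res, *p;
int fd = -1, rc;
memset(&hints, 0, sizeof(hints));
hints.ai_family = AF_UNSPEC; /* v4 OR v6 */
hints.ai_socktype = SOCK_STREAM;
rc = getaddrinfo("www.example.com", "http", &hints, &res);
if (rc != 0) { fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc)); exit(1); }
/* res = linked list of ready-to-use (family, type, proto, sockaddr) */
for (p = res; p; p = p->ai_next) {
    fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
    if (fd < 0) continue; /* this family/type is unavailable: try the next */
    if (connect(fd, p->ai_addr, p->ai_addrlen) == 0) break; /* connected */
    close(fd);
    fd = -1;
}
freeaddrinfo(res);
/* fd == -1 here means no address could be reached */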
- Name and service resolved in one shot; results plug directly into socket/connect/bind (server use: hints.ai_flags = AI_PASSIVE, NULL host → the wildcard address, INADDR_ANY or in6addr_any).
- Errors via the return value + gai_strerror(); no h_errno.
- getnameinfo is the exact inverse: sockaddr → name + service strings (flags: NI_NUMERICHOST, NI_NUMERICSERV for "don't resolve, just print").
This loop-over-results pattern is the canonical modern client; write it from memory.
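The inverse direction can be tried without any DNS traffic by keeping both calls numeric. This is a sketch under that assumption; numeric_roundtrip is a hypothetical helper name.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>

/* Convert numeric host and port strings to a sockaddr with getaddrinfo,
 * then back to strings with getnameinfo. Returns 0 on success, otherwise
 * the EAI_* code of the failing call (printable with gai_strerror). */
int numeric_roundtrip(const char *host, const char *port,
                      char *hbuf, socklen_t hlen,
                      char *sbuf, socklen_t slen)
{
    struct addrinfo hints, *res;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV;   /* no lookups */

    rc = getaddrinfo(host, port, &hints, &res);
    if (rc != 0)
        return rc;

    /* sockaddr -> strings; the NI_NUMERIC* flags mean "just print". */
    rc = getnameinfo(res->ai_addr, res->ai_addrlen,
                     hbuf, hlen, sbuf, slen,
                     NI_NUMERICHOST | NI_NUMERICSERV);
    freeaddrinfo(res);
    return rc;
}
```

Dropping the two NI_NUMERIC* flags would make getnameinfo attempt a PTR lookup and an /etc/services lookup instead of printing the numbers.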
1.5 IPv6 support in the API
- sockaddr_in6, in6addr_any, loopback ::1; inet_pton/inet_ntop handle both families (Unit 1).
- Resolver option RES_USE_INET6 / hints.ai_family choose which records (A vs AAAA) you receive.
- IPv4-mapped IPv6 addresses (::ffff:192.168.1.5) let a single AF_INET6 socket serve IPv4 clients too (unless IPV6_V6ONLY is set), which is how dual-stack servers stay simple.
- Address-testing macros: IN6_IS_ADDR_V4MAPPED, IN6_IS_ADDR_LOOPBACK, IN6_IS_ADDR_MULTICAST, ...
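The address-testing macros can be exercised on text addresses. The sketch below uses hypothetical helpers unmap_v4 and is_loopback6, written for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* If text6 is an IPv4-mapped IPv6 address, write the embedded IPv4 address
 * into out4 and return 1. Returns 0 for any other valid IPv6 address and
 * -1 if text6 does not parse as IPv6. */
int unmap_v4(const char *text6, char *out4, socklen_t len)
{
    struct in6_addr a6;
    struct in_addr a4;

    if (inet_pton(AF_INET6, text6, &a6) != 1)
        return -1;
    if (!IN6_IS_ADDR_V4MAPPED(&a6))
        return 0;

    /* The IPv4 address occupies the last 4 of the 16 bytes. */
    memcpy(&a4, &a6.s6_addr[12], sizeof(a4));
    return inet_ntop(AF_INET, &a4, out4, len) != NULL ? 1 : -1;
}

/* Return 1 if text6 is the IPv6 loopback address, 0 if not, -1 on parse error. */
int is_loopback6(const char *text6)
{
    struct in6_addr a6;

    if (inet_pton(AF_INET6, text6, &a6) != 1)
        return -1;
    return IN6_IS_ADDR_LOOPBACK(&a6) ? 1 : 0;
}
```

A dual-stack server could apply the same test to the peer address returned by accept when it wants to log IPv4 clients in dotted-quad form.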
1.6 Which API in which code
| Situation | Use |
|---|---|
| new code, any code | getaddrinfo / getnameinfo |
| reading legacy programs & exams | gethostbyname/gethostbyaddr (know hostent & h_errno) |
| just printing an address | inet_ntop |
| "what services does /etc/services define?" | getservbyname/port |
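1.7 What actually happens when you resolve a name: the full flow
"Explain DNS resolution with a diagram" wants the whole journey, not just the API:
application → resolver library → local name server (recursive query)
local name server → root server → referral to the .com servers
local name server → .com server → referral to the example.com servers
local name server → example.com authoritative server → answer (with TTL)
local name server → caches the answer → resolver library → application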
Vocabulary the answer must use correctly: the application↔local-server query is recursive ("give me the final answer"); the local server's walk down the hierarchy is iterative (each level answers with a referral); answers carry a TTL and are cached at the local server — which is why the second lookup is instant and why DNS scales at all. Transport detail from Unit 1: queries ride UDP; responses too big (classically > 512 bytes) set the truncation bit and the resolver retries over TCP; zone transfers are always TCP.
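On the wire, "recursive" and "truncated" are single bits in the 12-byte DNS message header. The sketch below follows the standard header layout (ID, flags, four counts); the macro and function names are made up for illustration and do not come from any resolver library.

```c
#include <stddef.h>
#include <stdint.h>

#define DNS_HEADER_LEN 12
#define DNS_FLAG_TC 0x0200   /* truncated: the client should retry over TCP */
#define DNS_FLAG_RD 0x0100   /* recursion desired: "give me the final answer" */

/* Fill in the header of a one-question query. A stub resolver sets RD;
 * a local server walking the hierarchy sends its own queries without it. */
void dns_query_header(uint8_t hdr[DNS_HEADER_LEN], uint16_t id, int recursive)
{
    uint16_t flags = recursive ? DNS_FLAG_RD : 0;
    size_t i;

    hdr[0] = (uint8_t)(id >> 8);       /* all fields are big-endian */
    hdr[1] = (uint8_t)(id & 0xff);
    hdr[2] = (uint8_t)(flags >> 8);
    hdr[3] = (uint8_t)(flags & 0xff);
    hdr[4] = 0;
    hdr[5] = 1;                        /* QDCOUNT = 1 question */
    for (i = 6; i < DNS_HEADER_LEN; i++)
        hdr[i] = 0;                    /* ANCOUNT, NSCOUNT, ARCOUNT = 0 */
}

/* Return 1 if a received message has the truncation bit set. */
int dns_response_truncated(const uint8_t *msg, size_t len)
{
    uint16_t flags;

    if (len < DNS_HEADER_LEN)
        return 0;
    flags = (uint16_t)((msg[2] << 8) | msg[3]);
    return (flags & DNS_FLAG_TC) != 0;
}
```

Applications never build these bytes themselves; the resolver library does it, which is why the vocabulary matters more than the layout in an exam answer.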
1.8 getaddrinfo, argument by argument
int getaddrinfo(const char *hostname, /* name, address string, or NULL */
const char *service, /* "http", "80", or NULL */
const struct addrinfo *hints,
struct addrinfo **result);
- hostname: a name to resolve, a numeric string ("192.168.1.5", "::1"), or NULL (with AI_PASSIVE: the wildcard address; without it: the loopback address).
- service: a service name (looked up in /etc/services) or a port number as a decimal string; resolved together with the host so each returned entry has the port already installed in its sockaddr.
- hints: a partially filled addrinfo acting as a filter. Set ai_family (AF_INET / AF_INET6 / AF_UNSPEC), ai_socktype (without it you get up to three entries per address: stream, dgram, raw), and ai_flags:
| Flag | Effect |
|---|---|
| AI_PASSIVE | results for bind() (wildcard address if hostname is NULL): server mode |
| AI_CANONNAME | also return the canonical name in ai_canonname |
| AI_NUMERICHOST | hostname must be numeric; forbids DNS traffic entirely |
| AI_NUMERICSERV | service must be a number string |
| AI_V4MAPPED, AI_ALL | IPv6 sockets: map/include IPv4 results |
- result: a malloc'd linked list of addrinfo (hence freeaddrinfo, not free); each node carries (ai_family, ai_socktype, ai_protocol) ready for socket() and (ai_addr, ai_addrlen) ready for connect/bind, which is why no casts or htons appear anywhere in the modern client.
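The server-side use of these arguments can be sketched as follows: NULL hostname plus AI_PASSIVE, then bind on the first result that works. passive_results and open_listener are hypothetical helper names, and the backlog of 16 is an arbitrary choice for the sketch.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ask getaddrinfo for wildcard addresses suitable for bind().
 * Returns 0 or an EAI_* code; on success the caller frees *res. */
int passive_results(const char *service, int family, struct addrinfo **res)
{
    struct addrinfo hints;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = family;          /* AF_INET, AF_INET6 or AF_UNSPEC */
    hints.ai_socktype = SOCK_STREAM;   /* one entry per address, not three */
    hints.ai_flags = AI_PASSIVE;       /* NULL hostname -> wildcard */
    return getaddrinfo(NULL, service, &hints, res);
}

/* Bind and listen on the first usable result; returns the fd or -1. */
int open_listener(const char *service)
{
    struct addrinfo *res, *p;
    int fd = -1, on = 1;

    if (passive_results(service, AF_UNSPEC, &res) != 0)
        return -1;

    for (p = res; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;                  /* family unavailable: try the next */
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
        if (bind(fd, p->ai_addr, p->ai_addrlen) == 0 && listen(fd, 16) == 0)
            break;                     /* listening */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

The port arrives already in network byte order inside ai_addr, so the server code contains no htons call, mirroring the client loop.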
1.9 gethostbyname vs getaddrinfo — the comparison table
| Criterion | gethostbyname | getaddrinfo |
|---|---|---|
| families | IPv4 (gethostbyname2 hacked in v6) | any, uniformly; AF_UNSPEC for both |
| also resolves service/port | no — pair it with getservbyname yourself | yes, in the same call |
| reentrancy | static buffer — not thread-safe | allocates per call — thread-safe |
| error reporting | h_errno globals | return code + gai_strerror |
| result format | hostent (raw address list) | ready-to-use sockaddr list |
| status | legacy/educational | the modern API |
Exam pointers
- "Explain the resource record types in DNS": the §1.1 table; A vs AAAA vs PTR vs MX vs CNAME with one-line uses; mention in-addr.arpa for PTR.
- "Describe hostent / explain gethostbyname": draw the structure with its double indirection (h_addr_list is an array of pointers to addresses); state h_errno's four values; mention the multihomed loop-over-addresses discipline.
- "Recursive vs iterative resolution": the §1.7 flow, with caching/TTL as the closing point.
- Code question, "write a protocol-independent client using getaddrinfo": the §1.4 loop; practice until the hints initialisation is automatic.
Check yourself
- Which call resolves "www.example.com" and "https" into a connect-ready sockaddr in one step — and which three legacy calls did it replace?
- gethostbyname returns NULL and h_errno is NO_DATA. What exists in DNS for this name and what doesn't? (Hint: a mail-only domain.)
- Why must a client loop over all of h_addr_list / the addrinfo list instead of trying only the first address?
- Your threaded server corrupts hostnames intermittently when two threads resolve at once. Name the root cause and the fix.
- What does AI_PASSIVE + NULL hostname yield, and which call consumes the result — connect or bind?