Siksha Sarovar

Siksha Sarovar (sikshasarovar.com) is a free educational web application that helps students in India learn programming and prepare for academic and competitive exams. The platform offers structured coding courses (C, C++, Python, Java, HTML, CSS, PHP, Power BI, AI, Machine Learning, Data Science), complete university curriculum notes for BCA/MCA students with previous year question papers, Class 10 and Class 12 CBSE/HBSE school notes, and dedicated preparation material for SSC, UPSC, Banking, Railway and other government exams. Browsing the site is completely free and requires no account. Users may optionally sign in with Google solely to save their learning progress, quiz scores and personal preferences across devices.

Practical 1: Regular Expressions (Modifiers, Operators, Metacharacters)

Lesson 2 of 35 in the free Web Based Programming Lab notes on Siksha Sarovar, written by Rohit Jangra.

Aim

To write and test regular expressions in PHP that demonstrate pattern modifiers, operators (quantifiers) and metacharacters using the PCRE preg_* function family.

Theory

PHP's regex engine is PCRE (Perl Compatible Regular Expressions), exposed through preg_match(), preg_match_all(), preg_replace() and preg_split(). A pattern is written between delimiters (commonly /pattern/) and may be followed by modifiers that change engine behaviour:

  • i — case-insensitive matching
  • m — multiline: ^ and $ anchor at every line boundary
  • s — dotall: . also matches newline
  • x — extended: whitespace inside the pattern is ignored (self-documenting patterns)
  • u — treat pattern and subject as UTF-8
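
The effect of each modifier is easiest to see by running the same pattern with and without it. The following is a minimal sketch, not part of the lab program; the subjects and labels are made up for illustration. With x, only unescaped whitespace is ignored, so the spaces in that pattern are pure layout.

```php
<?php
// Each pair runs one pattern with and without a modifier.
// All subjects and labels here are illustrative assumptions.
$checks = [
    'i: case-insensitive'           => preg_match('/php/i', 'Learn PHP'),
    'no i: case-sensitive'          => preg_match('/php/', 'Learn PHP'),
    'm: ^ and $ match at each line' => preg_match('/^two$/m', "one\ntwo\nthree"),
    'no m: ^ and $ span the subject' => preg_match('/^two$/', "one\ntwo\nthree"),
    's: dot matches newline'        => preg_match('/a.b/s', "a\nb"),
    'no s: dot stops at newline'    => preg_match('/a.b/', "a\nb"),
    'x: layout whitespace ignored'  => preg_match('/ \d{2}  -  \d{2} /x', '12-34'),
    'u: dot matches one character'  => preg_match('/^.$/u', "\u{00E9}"),
    'no u: dot matches one byte'    => preg_match('/^.$/', "\u{00E9}"),
];

foreach ($checks as $label => $result) {
    echo str_pad($label, 36), ($result === 1 ? 'MATCH' : 'NO MATCH'), PHP_EOL;
}
```

Every line that names a modifier should print MATCH and every "no" line should print NO MATCH. The last pair works because "\u{00E9}" (é) is a single character but two bytes in UTF-8.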

Metacharacters carry special meaning: ^ (start anchor), $ (end anchor), . (any character except newline, unless the s modifier is set), \d (digit), \w (word character), \s (whitespace), \b (zero-width word boundary), plus character classes like [a-z] and the negated form [^a-z]. Quantifiers control repetition: * (0 or more), + (1 or more), ? (0 or 1) and {n} / {n,m} (bounded). The | operator provides alternation between sub-patterns. Quantifiers are greedy by default; appending ? makes them lazy. Note the return contract: preg_match() returns 1 on a match, 0 on no match and false on failure, such as a malformed pattern, so comparing with === 1 is the robust production idiom.
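
A short sketch of greedy versus lazy matching and of the three-way return contract follows. The helper matchStatus(), the sample strings and the course-code format are hypothetical and chosen only for illustration; the @ operator is used here solely to keep the warning from the deliberately malformed pattern out of the output.

```php
<?php
// Greedy vs lazy: same subject, different quantifier behaviour.
$html = '<b>one</b> and <b>two</b>';
preg_match('/<b>.*<\/b>/', $html, $greedy);  // .* extends to the last </b>
preg_match('/<b>.*?<\/b>/', $html, $lazy);   // .*? stops at the first </b>
echo $greedy[0], PHP_EOL; // <b>one</b> and <b>two</b>
echo $lazy[0], PHP_EOL;   // <b>one</b>

// Alternation plus a bounded quantifier (the code format is an assumption).
$isCode = preg_match('/^(BCA|MCA)-\d{3}$/', 'BCA-204'); // 1

// Hypothetical helper that separates the three possible return values.
function matchStatus(string $pattern, string $subject): string
{
    $result = @preg_match($pattern, $subject); // @ hides the compile warning
    if ($result === false) {
        return 'ERROR';
    }
    return $result === 1 ? 'MATCH' : 'NO MATCH';
}

echo matchStatus('/\d+/', 'room 42'), PHP_EOL;   // MATCH
echo matchStatus('/\d+/', 'no digits'), PHP_EOL; // NO MATCH
echo matchStatus('/[a-z/', 'abc'), PHP_EOL;      // ERROR (unterminated class)
```

A loose check such as if (!preg_match(...)) would treat the ERROR case the same as NO MATCH, which is why the strict comparison is preferred.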

Requirements

  • XAMPP/WAMP with PHP 8.x (or standalone PHP CLI)
  • Code editor (VS Code)
  • Browser (Chrome/Edge) or terminal

Procedure

  1. Start Apache from the XAMPP Control Panel.
  2. Create the folder C:\xampp\htdocs\wbplab and save the program as p01_regex.php inside it.
  3. Type the code from the snippet below and save.
  4. Run it at http://localhost/wbplab/p01_regex.php, or from the terminal with php p01_regex.php.
  5. Edit the $samples array, predict each MATCH/NO MATCH, then re-run to confirm.

Explanation of the Code

  • $patterns is an associative array mapping a human-readable label to a PCRE pattern. The first pattern /^web[a-z]*\d{2}$/i anchors the whole string (^...$), requires the literal web, then a run of zero or more letters ([a-z]*), then exactly two digits (\d{2}); the i modifier is why WebDev23 matches despite its capital letters.
  • The second pattern /\d+/ uses the + operator — at least one digit anywhere in the subject.
  • The third pattern /\bphp\b/i uses the \b word-boundary metacharacter so php matches only as a whole word, never inside a longer token like phpMyAdmin.
  • The nested foreach feeds every string in $samples to preg_match() and prints MATCH/NO MATCH through a ternary expression.
  • Finally, preg_replace('/[^a-z\s]/i', '', $sentence) deletes every character that is not a letter or whitespace (a negated character class), stripping punctuation from $sentence.
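
The program itself is not reproduced in these notes, so the following is a reconstruction from the explanation above rather than the author's original listing. The three patterns, the sample strings and the preg_replace() call come from the text; the label wording and the exact output layout are assumptions. It assumes a terminal run; in a browser the line breaks would need a pre element or similar.

```php
<?php
// Label => PCRE pattern. The label text is an assumption.
$patterns = [
    'web + letters + two digits' => '/^web[a-z]*\d{2}$/i',
    'one or more digits'         => '/\d+/',
    'whole word php'             => '/\bphp\b/i',
];

$samples = ['WebDev23', 'course123', 'I love PHP and regex'];

// Test every sample against every pattern.
foreach ($patterns as $label => $pattern) {
    echo $label, ' (', $pattern, ')', PHP_EOL;
    foreach ($samples as $sample) {
        $verdict = preg_match($pattern, $sample) === 1 ? 'MATCH' : 'NO MATCH';
        echo '  ', $sample, ' => ', $verdict, PHP_EOL;
    }
}

// Strip everything that is not a letter or whitespace.
$sentence = 'PHP, Python, and JavaScript!';
$cleaned  = preg_replace('/[^a-z\s]/i', '', $sentence);

echo 'Original: ', $sentence, PHP_EOL;
echo 'After preg_replace: ', $cleaned, PHP_EOL;
```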

Expected Output

For each labelled pattern the script prints one line per sample. Pattern 1 matches only WebDev23. Pattern 2 matches WebDev23 and course123, since both contain digits. Pattern 3, the word-boundary pattern, matches only I love PHP and regex. Every other sample prints NO MATCH. The final lines show Original: PHP, Python, and JavaScript! and After preg_replace: PHP Python and JavaScript, with the commas and the exclamation mark removed and the spaces preserved.

🎯 Viva Questions

  1. Why does a PCRE pattern need delimiters? They separate the pattern body from trailing modifiers such as i or m.
  2. Difference between preg_match() and preg_match_all()? The former stops at the first match; the latter collects every match into $matches.
  3. What does \b match? A zero-width position between a word character and a non-word character, or between a word character and the start or end of the subject; it consumes no text.
  4. Greedy vs lazy quantifier? Greedy (.*) takes the longest possible match; lazy (.*?) takes the shortest.
  5. What does [^a-z] mean? A negated class — any single character that is not a lowercase letter.
  6. Why compare preg_match() with === 1? It can also return false on pattern error, which a loose truthiness check would mishandle.
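
Several of these answers can be checked directly. The sketch below uses made-up subject strings to contrast preg_match() with preg_match_all(), to show that \b consumes no text, and to apply a negated class.

```php
<?php
$subject = 'Rooms 12, 7 and 305'; // illustrative subject

// preg_match() stops at the first match.
preg_match('/\d+/', $subject, $first);
echo $first[0], PHP_EOL; // 12

// preg_match_all() returns the match count and fills $all[0] with every match.
$count = preg_match_all('/\d+/', $subject, $all);
echo $count, ': ', implode(', ', $all[0]), PHP_EOL; // 3: 12, 7, 305

// \b is zero-width: replacing it inserts markers but removes no characters.
$marked = preg_replace('/\b/', '|', 'php rocks');
echo $marked, PHP_EOL; // |php| |rocks|

// [^a-z] without the i modifier also removes uppercase letters.
$lowerOnly = preg_replace('/[^a-z]/', '', 'Web 2.0 php');
echo $lowerOnly, PHP_EOL; // ebphp
```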

CO Mapping

CO1, CO2