Siksha Sarovar

Siksha Sarovar (sikshasarovar.com) is a free educational web application that helps students in India learn programming and prepare for academic and competitive exams. The platform offers structured coding courses (C, C++, Python, Java, HTML, CSS, PHP, Power BI, AI, Machine Learning, Data Science), complete university curriculum notes for BCA/MCA students with previous year question papers, Class 10 and Class 12 CBSE/HBSE school notes, and dedicated preparation material for SSC, UPSC, Banking, Railway and other government exams. Browsing the site is completely free and requires no account. Users may optionally sign in with Google solely to save their learning progress, quiz scores and personal preferences across devices.


File Handling in Python

Lesson 22 of 37 in the free Data Science notes on Siksha Sarovar, written by Rohit Jangra.


Definition: File handling refers to the ability to read from and write to files on the file system. In Data Science, you constantly work with files: CSVs, text files, JSON, configuration files, and log files. Understanding file handling in Python is essential.

---

Why File Handling Matters in Data Science

  • Loading datasets from CSV, JSON, or text files.
  • Saving model outputs, predictions, and reports.
  • Logging experiment results.
  • Reading configuration files for pipelines.

---

Opening a File

Syntax: file = open("filename", "mode")

File Modes:

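| Mode | Description | Creates File? | Overwrites? |
|------|-------------|---------------|-------------|
| "r" | Read (default) | ❌ No (error if not found) | ❌ No |
| "w" | Write | ✅ Yes | ✅ Yes (erases existing) |
| "a" | Append | ✅ Yes | ❌ No (adds to end) |
| "x" | Create (exclusive) | ✅ Yes (error if exists) | ❌ No |
| "r+" | Read + Write | ❌ No | ❌ No |
| "b" | Binary mode (add to above) | n/a | n/a |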
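
The behaviour in the table can be checked with a short experiment. The following is a minimal sketch, assuming a throwaway temporary directory and the hypothetical file names demo.txt and missing.txt (neither comes from the lesson):

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.txt")  # hypothetical file name

# "w" creates the file, or erases it if it already exists.
f = open(path, "w")
f.write("first\n")
f.close()

f = open(path, "w")  # opening with "w" again discards "first"
f.write("second\n")
f.close()

# "a" keeps the existing content and adds to the end.
f = open(path, "a")
f.write("third\n")
f.close()

f = open(path, "r")
content = f.read()
f.close()
print(repr(content))

# "x" refuses to open a file that already exists.
try:
    open(path, "x")
    x_error = None
except FileExistsError as exc:
    x_error = type(exc).__name__

# "r" fails when the file is missing.
try:
    open(os.path.join(workdir, "missing.txt"), "r")
    r_error = None
except FileNotFoundError as exc:
    r_error = type(exc).__name__

print(x_error, r_error)
```

Only "second" and "third" survive in the file, because the second "w" open erased the first write before "a" added to the end.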

---

Reading Files

Method 1: read() reads the entire file as a single string.

file = open("data.txt", "r")
content = file.read()
print(content)
file.close()

Method 2: readline() reads one line at a time.

file = open("data.txt", "r")
line1 = file.readline()   # first line, including its trailing "\n"
line2 = file.readline()   # second line ("" if the file has no more lines)
file.close()

Method 3: readlines() reads all lines into a list.

file = open("data.txt", "r")
lines = file.readlines()   # ["line1\n", "line2\n", ...]
file.close()
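
Two details of these methods are easy to miss: every line keeps its trailing newline, and readline() returns an empty string at the end of the file. The sketch below illustrates both, and also shows looping over the file object directly, which reads one line at a time instead of loading everything at once. The temporary file and its contents are made up for the illustration:

```python
import os
import tempfile

# Create a small sample file (hypothetical content) to read back.
path = os.path.join(tempfile.mkdtemp(), "data.txt")
file = open(path, "w")
file.write("alpha\nbeta\ngamma\n")
file.close()

# readline() keeps the trailing "\n" and returns "" at end of file.
file = open(path, "r")
first = file.readline()
file.readline()            # skips "beta\n"
file.readline()            # skips "gamma\n"
at_end = file.readline()   # nothing left to read
file.close()

# Iterating over the file object yields one line at a time,
# so the whole file never has to sit in memory at once.
file = open(path, "r")
cleaned = []
for line in file:
    cleaned.append(line.rstrip("\n"))  # drop the newline
file.close()

print(repr(first), repr(at_end), cleaned)
```

For large datasets the loop form is usually the safer choice, since read() and readlines() both load the full file into memory.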

---

Writing to Files

# Write mode (overwrites file)
file = open("output.txt", "w")
file.write("Hello, World!\n")
file.write("Data Science is fun!")
file.close()

# Append mode (adds to end)
file = open("output.txt", "a")
file.write("\nNew line appended!")
file.close()

---

The with Statement (Best Practice)

The with statement automatically closes the file when the block is exited, even if an error occurs. Always use with for file handling.

with open("data.txt", "r") as file:
    content = file.read()
    print(content)
# File is automatically closed here
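
To see that the file really is closed when an error interrupts the block, one option is to keep a reference to the file object and inspect its closed attribute afterwards. This is a small sketch under that assumption; the file contents and the names handle and error_seen are invented for the demonstration:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.txt")  # hypothetical file
with open(path, "w") as file:
    file.write("42\nnot a number\n")

handle = None
try:
    with open(path, "r") as file:
        handle = file  # keep a reference to inspect afterwards
        # int() succeeds on "42\n" but raises ValueError on the second line.
        numbers = [int(line) for line in file]
except ValueError:
    error_seen = True
else:
    error_seen = False

# The error occurred inside the with block, yet the file was still closed.
print(error_seen, handle.closed)
```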

---

Reading & Writing Comparison

| Operation | Method | Description |
|-----------|--------|-------------|
| Read entire file | file.read() | Returns one big string |
| Read one line | file.readline() | Returns next line |
| Read all lines | file.readlines() | Returns list of lines |
| Write string | file.write(str) | Writes a string |
| Write list | file.writelines(list) | Writes a list of strings |
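
One point the table does not show is that writelines() does not add line breaks, and both write methods accept only strings. The sketch below writes a list of numbers and reads it back; the file name scores.txt and the values are made up:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "scores.txt")  # hypothetical file
scores = [91, 78, 85]  # made-up values

# write() and writelines() accept strings only, and writelines()
# does not insert line breaks, so add "\n" to each item yourself.
lines = [f"{score}\n" for score in scores]
with open(path, "w") as file:
    file.writelines(lines)

with open(path, "r") as file:
    read_back = file.readlines()

# int() ignores the trailing newline when converting back to numbers.
restored = [int(line) for line in read_back]
print(read_back, restored)
```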

---

Working with CSV Files

CSV (Comma-Separated Values) is one of the most common data formats in Data Science.

Using the csv module:

import csv

# Reading CSV (the csv module expects files opened with newline="")
with open("data.csv", "r", newline="") as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)  # Each row is a list of strings

# Writing CSV
with open("output.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["Name", "Age", "City"])
    writer.writerow(["Rahul", 21, "Delhi"])

Using Pandas (Preferred in Data Science):

import pandas as pd
df = pd.read_csv("data.csv")     # Read
df.to_csv("output.csv", index=False)  # Write
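
When the columns have names, the csv module's DictWriter and DictReader classes can be more convenient than plain lists. The following is a sketch only, using the standard library so it runs without pandas; the file name people.csv and the second row are invented for the example:

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "people.csv")  # hypothetical file
rows = [
    {"Name": "Rahul", "Age": 21, "City": "Delhi"},
    {"Name": "Asha", "Age": 23, "City": "Pune"},  # made-up row
]

# DictWriter maps dictionary keys to columns via fieldnames.
with open(path, "w", newline="") as file:
    writer = csv.DictWriter(file, fieldnames=["Name", "Age", "City"])
    writer.writeheader()
    writer.writerows(rows)

# DictReader uses the header row as keys for every record.
with open(path, "r", newline="") as file:
    loaded = list(csv.DictReader(file))

# CSV stores text only, so numbers come back as strings.
ages = [int(row["Age"]) for row in loaded]
print(loaded[0], ages)
```

The explicit int() conversion is the step that pandas normally handles for you when it infers column types.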

---

Working with JSON Files

JSON (JavaScript Object Notation) is commonly used in APIs and web data.

import json

# Reading JSON
with open("data.json", "r") as file:
    data = json.load(file)    # Returns dict or list

# Writing JSON
with open("output.json", "w") as file:
    json.dump(data, file, indent=4)
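
A full round trip makes the type mapping clearer. In this sketch the configuration dictionary and the file name config.json are hypothetical; it writes the data, reads it back, and also shows that a tuple is stored as a JSON array and therefore returns as a list:

```python
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "config.json")  # hypothetical file
config = {  # made-up pipeline settings
    "model": "linear_regression",
    "learning_rate": 0.01,
    "features": ["age", "income"],
    "shuffle": True,   # stored as JSON true
    "seed": None,      # stored as JSON null
}

with open(path, "w") as file:
    json.dump(config, file, indent=4)

with open(path, "r") as file:
    loaded = json.load(file)

# json.dumps / json.loads do the same job for strings instead of files.
text = json.dumps({"point": (1, 2)})
point = json.loads(text)["point"]  # the tuple comes back as a list

print(loaded == config, point)
```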

---

File Handling Summary Table

| Format | Module | Read | Write |
|--------|--------|------|-------|
| Text (.txt) | Built-in | open().read() | open().write() |
| CSV (.csv) | csv / pandas | csv.reader() / pd.read_csv() | csv.writer() / df.to_csv() |
| JSON (.json) | json | json.load() | json.dump() |
| Excel (.xlsx) | pandas / openpyxl | pd.read_excel() | df.to_excel() |
| Pickle (.pkl) | pickle | pickle.load() | pickle.dump() |
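
Pickle appears in the table but not in the examples above, so here is a minimal sketch of it. It relies on the binary modes "wb" and "rb", since pickle writes bytes rather than text. The results dictionary and the file name results.pkl are made up, and pickle files should only be loaded when they come from a source you trust:

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "results.pkl")  # hypothetical file
results = {  # made-up values
    "accuracy": 0.93,
    "labels": ("cat", "dog"),
    "seen": {1, 2, 3},
}

# Pickle data is bytes, so the file must be opened in binary mode.
with open(path, "wb") as file:
    pickle.dump(results, file)

with open(path, "rb") as file:
    restored = pickle.load(file)

# Unlike JSON, pickle preserves Python-specific types such as tuple and set.
print(restored == results, type(restored["labels"]).__name__)
```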

Summary

  • File handling allows reading from and writing to files on disk.
  • Always use with open() to ensure files are properly closed.
  • File modes (r, w, a, x) determine the operation and behavior.
  • CSV and JSON are among the most common formats in Data Science.
  • Pandas provides the simplest interface for reading/writing tabular data.