Like Schema — Polymorphic Design
A "like" in our application can be applied to three different types of content: videos, comments, and tweets. Instead of creating three separate schemas (VideoLike, CommentLike, TweetLike), we use a polymorphic design with optional reference fields:
import { Schema } from 'mongoose';

const likeSchema = new Schema(
  {
    // The user who created the like
    likedBy: { type: Schema.Types.ObjectId, ref: 'User', required: true },
    // Optional targets: one per content type
    video: { type: Schema.Types.ObjectId, ref: 'Video' },
    comment: { type: Schema.Types.ObjectId, ref: 'Comment' },
    tweet: { type: Schema.Types.ObjectId, ref: 'Tweet' },
  },
  { timestamps: true }
);
Each Like document is meant to have exactly one of the optional fields populated: a like on a video sets video, a like on a comment sets comment, and a like on a tweet sets tweet.
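Nothing in the schema definition itself enforces the exactly-one rule, so it has to be checked somewhere. The following is a minimal sketch of one way to do that; the helper names `countLikeTargets` and `validateLikeTarget` are hypothetical, and wiring them into a `pre('validate')` hook is an assumption about where the check might live.

```javascript
// Hypothetical helpers (not part of the original schema).
const LIKE_TARGET_FIELDS = ['video', 'comment', 'tweet'];

// Counts how many of the three optional target fields are set.
function countLikeTargets(like) {
  return LIKE_TARGET_FIELDS.filter(
    (field) => like[field] !== undefined && like[field] !== null
  ).length;
}

// Returns an error message, or null when exactly one target is set.
function validateLikeTarget(like) {
  const count = countLikeTargets(like);
  return count === 1
    ? null
    : `A like must reference exactly one of ${LIKE_TARGET_FIELDS.join(', ')} (found ${count})`;
}

// One possible way to wire it into Mongoose (assumption, shown as a comment):
// likeSchema.pre('validate', function (next) {
//   const message = validateLikeTarget(this);
//   next(message ? new Error(message) : undefined);
// });
```

Keeping the check in a plain function makes it easy to unit-test without a database connection.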
Why not use refPath (dynamic reference)?
refPath is Mongoose's built-in mechanism for polymorphic (dynamic) references:
{
  onModel: { type: String, enum: ['Video', 'Comment', 'Tweet'] },
  onDocument: { type: Schema.Types.ObjectId, refPath: 'onModel' }
}
Both approaches work, but the three optional fields approach is more explicit and easier to query. With separate fields, indexing is straightforward and querying likes for a specific video is simply Like.find({ video: videoId }).
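To make the comparison concrete, here is a small sketch of the query filter each design would need. Both helper names are hypothetical and exist only for illustration; `Like` in the usage comment is assumed to be the compiled model.

```javascript
// Hypothetical helpers for illustration; neither appears in the original code.
const TARGET_FIELDS = ['video', 'comment', 'tweet'];

// Three-optional-fields design: the target type is the field name itself.
function likeFilterOptionalFields(targetField, targetId) {
  if (!TARGET_FIELDS.includes(targetField)) {
    throw new Error(`Unknown like target: ${targetField}`);
  }
  return { [targetField]: targetId };
}

// refPath design: the type lives in onModel, the id in onDocument.
function likeFilterRefPath(modelName, targetId) {
  return { onModel: modelName, onDocument: targetId };
}

// Usage with a Mongoose model (assumed to be named Like):
// const likes = await Like.find(likeFilterOptionalFields('video', videoId));
```

With separate fields the filter has a single key, which is also the key a per-target index is built on.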
---
Like Schema Indexes
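// Unique: a user can like a given video/comment/tweet only once.
likeSchema.index({ likedBy: 1, video: 1 }, { unique: true, partialFilterExpression: { video: { $exists: true } } });
likeSchema.index({ likedBy: 1, comment: 1 }, { unique: true, partialFilterExpression: { comment: { $exists: true } } });
likeSchema.index({ likedBy: 1, tweet: 1 }, { unique: true, partialFilterExpression: { tweet: { $exists: true } } });
// A partial index only includes documents where the target field exists.
// sparse: true is not enough on a compound index: a document is indexed when any key
// (here likedBy, which is always set) is present, so a missing target collides as null.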
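When a unique index rejects a second like, MongoDB reports a duplicate-key error (error code 11000). The sketch below treats that case as "already liked" rather than a server failure. `isDuplicateKeyError` and `likeOnce` are hypothetical names, and `Like` is assumed to be the compiled Mongoose model passed in by the caller.

```javascript
// MongoDB reports unique-index violations with error code 11000.
function isDuplicateKeyError(err) {
  return Boolean(err) && err.code === 11000;
}

// Hypothetical helper: creates a like, treating "already liked" as success.
// `Like` is assumed to be a Mongoose model passed in by the caller.
async function likeOnce(Like, likedBy, targetField, targetId) {
  try {
    await Like.create({ likedBy, [targetField]: targetId });
    return { liked: true, alreadyLiked: false };
  } catch (err) {
    if (isDuplicateKeyError(err)) {
      return { liked: true, alreadyLiked: true };
    }
    throw err; // anything else is a real failure
  }
}
```

Relying on the index instead of a find-then-insert check avoids a race between two concurrent requests from the same user.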
---
Playlist Schema
A playlist contains an ordered list of videos and belongs to a user:
const playlistSchema = new Schema(
  {
    name: { type: String, required: true, trim: true },
    description: { type: String, default: '' },
    videos: [{ type: Schema.Types.ObjectId, ref: 'Video' }],
    owner: { type: Schema.Types.ObjectId, ref: 'User', required: true },
    isPublic: { type: Boolean, default: true },
  },
  { timestamps: true }
);
playlistSchema.index({ owner: 1 });
Operations:
- Add video: $addToSet (prevents duplicates) → { $addToSet: { videos: videoId } }
- Remove video: $pull → { $pull: { videos: videoId } }
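The two operations above can be sketched as small builders that return the update documents. All three function names are hypothetical, and scoping the filter to the owner is an added assumption about how ownership might be checked, not something the original states.

```javascript
// Hypothetical builders; they only construct plain objects for Mongoose to send.

// Assumption: filter by playlist id AND owner so users can only modify
// their own playlists.
function ownedPlaylistFilter(playlistId, ownerId) {
  return { _id: playlistId, owner: ownerId };
}

// $addToSet appends the id only if it is not already in the array.
function addVideoUpdate(videoId) {
  return { $addToSet: { videos: videoId } };
}

// $pull removes every occurrence of the id from the array.
function removeVideoUpdate(videoId) {
  return { $pull: { videos: videoId } };
}

// Usage (Playlist is assumed to be the compiled model):
// const updated = await Playlist.findOneAndUpdate(
//   ownedPlaylistFilter(playlistId, userId),
//   addVideoUpdate(videoId),
//   { new: true } // return the document after the update
// );
```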
---
Tweet Schema
Tweets are short text posts (similar to Twitter):
const tweetSchema = new Schema(
  {
    content: {
      type: String,
      required: [true, 'Tweet content is required'],
      trim: true,
      maxlength: [280, 'Tweet cannot exceed 280 characters'],
    },
    owner: { type: Schema.Types.ObjectId, ref: 'User', required: true },
  },
  { timestamps: true }
);
tweetSchema.index({ owner: 1, createdAt: -1 });
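The compound index matches a query that filters by owner and sorts by createdAt descending. A minimal sketch of such a query follows; `userTweetsQuery` is a hypothetical helper, and the page size cap of 50 is an arbitrary assumption chosen for the example.

```javascript
// Hypothetical helper: builds the pieces of a paginated "user's tweets" query.
function userTweetsQuery(ownerId, page = 1, limit = 10) {
  const safePage = Math.max(1, Math.floor(page));
  // Assumption: cap the page size at 50 to keep responses small.
  const safeLimit = Math.min(50, Math.max(1, Math.floor(limit)));
  return {
    filter: { owner: ownerId },  // equality on the first index key
    sort: { createdAt: -1 },     // same direction as the second index key
    skip: (safePage - 1) * safeLimit,
    limit: safeLimit,
  };
}

// Usage (Tweet is assumed to be the compiled model):
// const q = userTweetsQuery(userId, 2, 20);
// const tweets = await Tweet.find(q.filter).sort(q.sort).skip(q.skip).limit(q.limit);
```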
---
Virtual Populate for Like Count on Video
Instead of storing likeCount as a field on the Video document (which would require updating it on every like/unlike), use a virtual populate to compute it on demand:
// On Video schema:
videoSchema.virtual('likeCount', {
  ref: 'Like',
  localField: '_id',
  foreignField: 'video',
  count: true, // Returns count instead of array of documents
});
// Usage:
const video = await Video.findById(id).populate('likeCount');
console.log(video.likeCount); // Number
For the virtual to be included when a document is converted to JSON or a plain object, also set:
videoSchema.set('toJSON', { virtuals: true });
videoSchema.set('toObject', { virtuals: true });
---
Why These Designs Are Flexible and Scalable
| Design Decision | Benefit |
|---|---|
| Polymorphic likes (3 optional fields) | Single schema handles 3 content types |
| Unique indexes on likes (one per target field) | A user can like each item only once, enforced at the database level |
| $addToSet for playlist videos | Automatic deduplication at the database level |
| Tweet maxlength: 280 | Character limit enforced at schema level |
| Virtual populate for counts | Count computed on demand, no stale count data |
| { owner: 1, createdAt: -1 } index on tweets | Fast "get user's tweets, newest first" query |
---
Aggregation Counts vs Virtual Populate vs Denormalised Count
| Approach | When to Use | Pros | Cons |
|---|---|---|---|
| Aggregation (real-time) | Complex queries combining count + other data | Always accurate | Aggregation overhead per request |
| Virtual populate | Simple count on demand | Clean API | Extra DB query |
| Denormalised count field | High-read, low-write scenarios | Fastest reads | Risk of stale data, must update atomically |
For a YouTube-like app with many concurrent likes, a real-time aggregation count is safest because it never goes stale, at the cost of extra work per request. For a low-volume system, virtual populate is simplest.
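As an illustration of the aggregation approach, the sketch below builds a pipeline that attaches a like count to a single video. The function name is hypothetical, and the collection name 'likes' assumes Mongoose's default pluralisation of a model named 'Like'.

```javascript
// Hypothetical helper: returns an aggregation pipeline for one video
// together with its like count.
function videoWithLikeCountPipeline(videoId) {
  return [
    // Note: aggregation stages are not cast by Mongoose, so videoId
    // should already be an ObjectId when this runs against a real database.
    { $match: { _id: videoId } },
    {
      $lookup: {
        from: 'likes', // assumption: default collection name for the Like model
        localField: '_id',
        foreignField: 'video',
        as: 'likes',
      },
    },
    { $addFields: { likeCount: { $size: '$likes' } } },
    { $project: { likes: 0 } }, // drop the joined array, keep only the count
  ];
}

// Usage (Video is assumed to be the compiled model):
// const [video] = await Video.aggregate(videoWithLikeCountPipeline(videoId));
```

The same pipeline can be extended with further stages, which is where aggregation has an advantage over a virtual populate that returns only the count.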