Clean Code & Code Smells
Clean Code & Code Smells¶
Topic: Programming Lesson: 8 of 16 Prerequisites: Programming experience in at least one language, familiarity with functions and classes Objective: Learn to write clean, maintainable code by following best practices, recognizing code smells, and applying effective refactoring techniques
Introduction¶
"Any fool can write code that a computer can understand. Good programmers write code that humans can understand." — Martin Fowler
Clean code is code that is easy to read, understand, and modify. It minimizes cognitive load, reduces bugs, and makes collaboration easier. This lesson covers principles from Robert C. Martin's Clean Code, Martin Fowler's Refactoring, and other industry best practices.
The Cost of Bad Code¶
Technical debt accumulates when we take shortcuts:
# Technical debt example
def process(data): # What does this process?
x = [] # What is x?
for d in data: # Single-letter variable
if d[0] > 100: # Magic number
x.append(d[1] * 1.2) # What is 1.2?
return x
# The "interest" we pay:
# - Hard to understand
# - Hard to modify
# - Hard to test
# - Bugs hide easily
The clean version:
PRICE_MARKUP = 1.2
MINIMUM_QUANTITY = 100
def calculate_marked_up_prices(orders):
"""Extract prices from high-quantity orders and apply markup."""
marked_up_prices = []
for order in orders:
quantity = order[0]
price = order[1]
if quantity > MINIMUM_QUANTITY:
marked_up_price = price * PRICE_MARKUP
marked_up_prices.append(marked_up_price)
return marked_up_prices
Meaningful Names¶
Names are the primary way we communicate intent. Good names make code self-documenting.
Intention-Revealing Names¶
Bad:
int d; // elapsed time in days
Good:
int elapsedTimeInDays;
int daysSinceCreation;
int daysSinceModification;
Avoid Disinformation¶
Bad:
const accountList = {}; // Not a list, it's an object!
function hp() { } // What does hp mean?
const theAccounts = []; // "the" adds no information
Good:
const accountMap = {};
function calculateHorsePower() { }
// Or if context is clear:
function calculatePower() { }
const accounts = [];
Use Pronounceable Names¶
Bad:
genymdhms # generation year, month, day, hour, minute, second
Good:
generation_timestamp
Use Searchable Names¶
Bad:
for (int i = 0; i < 7; i++) { // What is 7?
// ...
}
Good:
const int DAYS_IN_WEEK = 7;
for (int day = 0; day < DAYS_IN_WEEK; day++) {
// ...
}
Class Names: Nouns¶
Classes represent things (nouns):
// Good
class Customer { }
class Account { }
class PaymentProcessor { }
// Bad
class Manager { } // Too vague
class Data { } // Meaningless
class Info { } // Meaningless
Method Names: Verbs¶
Methods perform actions (verbs):
# Good
def calculate_total():
pass
def send_email():
pass
def is_valid(): # Boolean: is_, has_, can_
pass
# Bad
def total(): # Noun (ambiguous)
pass
def email(): # Noun
pass
Avoid Encodings¶
Bad (Hungarian notation):
int iCount;
string strName;
bool bIsActive;
Good:
int count;
string name;
bool isActive;
One Word Per Concept¶
Pick one term and stick with it:
Inconsistent:
class UserFetcher { }
class AccountRetriever { }
class ProductGetter { }
Consistent:
class UserRepository { }
class AccountRepository { }
class ProductRepository { }
Avoid Noise Words¶
Bad:
const productData = {};
const productInfo = {};
const theProduct = {};
function getProductData() { }
function getProductInfo() { }
What's the difference between Data and Info? None!
Good:
const product = {};
function getProduct() { }
Functions¶
Small Functions¶
Functions should do one thing, do it well, and do it only.
Bad (does too much):
def process_order(order):
# Validate
if not order.get('customer_id'):
raise ValueError("Missing customer ID")
if not order.get('items'):
raise ValueError("No items")
# Calculate total
total = 0
for item in order['items']:
total += item['price'] * item['quantity']
# Apply discount
if order.get('coupon'):
discount = get_discount(order['coupon'])
total -= total * discount
# Save to database
db.save('orders', order)
# Send confirmation email
send_email(order['customer_id'], f"Order confirmed: ${total}")
return total
Good (single responsibilities):
def process_order(order):
validate_order(order)
total = calculate_order_total(order)
save_order(order)
send_confirmation_email(order, total)
return total
def validate_order(order):
if not order.get('customer_id'):
raise ValueError("Missing customer ID")
if not order.get('items'):
raise ValueError("No items")
def calculate_order_total(order):
subtotal = sum(item['price'] * item['quantity']
for item in order['items'])
discount = get_discount_amount(order, subtotal)
return subtotal - discount
def get_discount_amount(order, subtotal):
if coupon := order.get('coupon'):
discount_rate = get_discount(coupon)
return subtotal * discount_rate
return 0
def save_order(order):
db.save('orders', order)
def send_confirmation_email(order, total):
customer_id = order['customer_id']
message = f"Order confirmed: ${total}"
send_email(customer_id, message)
Ideal Argument Count¶
0 arguments (niladic): Best 1 argument (monadic): Good 2 arguments (dyadic): OK 3 arguments (triadic): Questionable 4+ arguments (polyadic): Avoid
Bad:
public void createUser(String firstName, String lastName, String email,
String phone, String address, String city,
String state, String zipCode) {
// Too many parameters!
}
Good:
public class UserData {
private String firstName;
private String lastName;
private String email;
private String phone;
private Address address;
// Getters and setters...
}
public class Address {
private String street;
private String city;
private String state;
private String zipCode;
}
public void createUser(UserData userData) {
// Single parameter (object)
}
No Side Effects¶
A function should do what its name says, nothing more.
Bad (hidden side effect):
function checkPassword(username, password) {
const user = db.getUser(username);
if (user.password === hash(password)) {
Session.initialize(); // Side effect! Not mentioned in name
return true;
}
return false;
}
Good:
function isPasswordCorrect(username, password) {
const user = db.getUser(username);
return user.password === hash(password);
}
// Caller controls side effects
if (isPasswordCorrect(username, password)) {
Session.initialize();
redirectToHomePage();
}
Command-Query Separation¶
A function should either do something (command) or return something (query), but not both.
Bad (does both):
def set_and_check_attribute(name, value):
"""Sets attribute and returns True if it was changed."""
old_value = get_attribute(name)
set_attribute(name, value)
return old_value != value
# Confusing usage:
if set_and_check_attribute('status', 'active'):
# Did we set it? Or just check it?
pass
Good (separate command and query):
def set_attribute(name, value):
"""Sets attribute (command)."""
attributes[name] = value
def has_attribute_changed(name, new_value):
"""Checks if value would change (query)."""
return get_attribute(name) != new_value
# Clear usage:
if has_attribute_changed('status', 'active'):
set_attribute('status', 'active')
DRY: Don't Repeat Yourself¶
Bad:
public void printUserReport(User user) {
System.out.println("Name: " + user.getFirstName() + " " + user.getLastName());
System.out.println("Email: " + user.getEmail());
System.out.println("Phone: " + user.getPhone());
}
public void emailUserDetails(User user) {
String details = "Name: " + user.getFirstName() + " " + user.getLastName() + "\n";
details += "Email: " + user.getEmail() + "\n";
details += "Phone: " + user.getPhone();
sendEmail(user.getEmail(), details);
}
Good:
public String formatUserDetails(User user) {
return String.format("Name: %s %s\nEmail: %s\nPhone: %s",
user.getFirstName(), user.getLastName(),
user.getEmail(), user.getPhone());
}
public void printUserReport(User user) {
System.out.println(formatUserDetails(user));
}
public void emailUserDetails(User user) {
sendEmail(user.getEmail(), formatUserDetails(user));
}
Code Smells¶
Code smells are indicators of potential problems. They don't always mean bad code, but they deserve attention.
Long Method¶
Smell:
def process_payment(order):
# 200 lines of code...
pass
Refactoring: Extract Method
def process_payment(order):
validate_payment_info(order)
charge_amount = calculate_charge(order)
transaction_id = charge_credit_card(charge_amount, order.card)
update_inventory(order)
send_confirmation(order, transaction_id)
return transaction_id
Large Class (God Class)¶
Smell:
public class User {
// Authentication
public boolean login(String password) { }
public void logout() { }
// Profile management
public void updateProfile() { }
public void uploadAvatar() { }
// Permissions
public boolean hasPermission(String permission) { }
public void grantPermission(String permission) { }
// Email
public void sendWelcomeEmail() { }
public void sendPasswordResetEmail() { }
// Analytics
public void trackLogin() { }
public void trackPageView() { }
// ... 50 more methods
}
Refactoring: Extract Class
public class User {
private UserProfile profile;
private UserAuthentication auth;
private UserPermissions permissions;
// Minimal interface
}
public class UserAuthentication {
public boolean login(String password) { }
public void logout() { }
}
public class UserProfile {
public void update() { }
public void uploadAvatar() { }
}
public class UserPermissions {
public boolean has(String permission) { }
public void grant(String permission) { }
}
Feature Envy¶
A method that uses data/methods from another class more than its own.
Smell:
class Order:
def __init__(self, customer):
self.customer = customer
def get_discount(self):
# Uses customer data extensively
if self.customer.is_premium():
if self.customer.years_active > 5:
return 0.20
else:
return 0.10
else:
return 0.05
Refactoring: Move Method
class Customer:
def get_discount_rate(self):
if self.is_premium():
if self.years_active > 5:
return 0.20
else:
return 0.10
else:
return 0.05
class Order:
def __init__(self, customer):
self.customer = customer
def get_discount(self):
return self.customer.get_discount_rate()
Data Clumps¶
Groups of data that always appear together.
Smell:
function createUser(firstName, lastName, email, street, city, state, zip) {
// ...
}
function updateUser(userId, firstName, lastName, email, street, city, state, zip) {
// ...
}
Refactoring: Introduce Parameter Object
class Address {
constructor(street, city, state, zip) {
this.street = street;
this.city = city;
this.state = state;
this.zip = zip;
}
}
class UserInfo {
constructor(firstName, lastName, email, address) {
this.firstName = firstName;
this.lastName = lastName;
this.email = email;
this.address = address;
}
}
function createUser(userInfo) {
// ...
}
function updateUser(userId, userInfo) {
// ...
}
Shotgun Surgery¶
A single change requires modifications in many classes.
Smell:
// Adding a new payment method requires changes in:
class Order { } // Add payment type check
class PaymentForm { } // Add UI for new payment
class PaymentValidator { } // Add validation
class Receipt { } // Add receipt format
class Analytics { } // Add tracking
Refactoring: Move related changes to one place (Strategy pattern)
interface PaymentMethod {
void process(double amount);
String getReceiptFormat();
void validate();
void track();
}
class CreditCardPayment implements PaymentMethod { }
class PayPalPayment implements PaymentMethod { }
class CryptoPayment implements PaymentMethod { }
// Now adding a new payment method only requires one new class
Divergent Change¶
One class is commonly changed in different ways for different reasons.
Smell:
class Report:
def generate(self):
data = self.fetch_data() # Database logic (reason 1)
formatted = self.format(data) # Formatting logic (reason 2)
self.send_email(formatted) # Email logic (reason 3)
Refactoring: Extract Class (Single Responsibility Principle)
class ReportDataFetcher:
def fetch(self):
# Database logic only
pass
class ReportFormatter:
def format(self, data):
# Formatting logic only
pass
class ReportEmailer:
def send(self, report):
# Email logic only
pass
class ReportGenerator:
def __init__(self, fetcher, formatter, emailer):
self.fetcher = fetcher
self.formatter = formatter
self.emailer = emailer
def generate(self):
data = self.fetcher.fetch()
formatted = self.formatter.format(data)
self.emailer.send(formatted)
Primitive Obsession¶
Using primitives instead of small objects for simple tasks.
Smell:
void sendMessage(std::string phoneNumber) {
// Is phoneNumber valid? No type safety
if (phoneNumber.length() != 10) {
throw std::invalid_argument("Invalid phone");
}
// ...
}
sendMessage("invalid"); // Compiles, but wrong
Refactoring: Replace Data Value with Object
class PhoneNumber {
private:
std::string number;
public:
PhoneNumber(const std::string& num) {
if (num.length() != 10 || !isDigits(num)) {
throw std::invalid_argument("Invalid phone number");
}
number = num;
}
std::string toString() const { return number; }
private:
bool isDigits(const std::string& str) const {
return std::all_of(str.begin(), str.end(), ::isdigit);
}
};
void sendMessage(const PhoneNumber& phoneNumber) {
// Type safety! phoneNumber is guaranteed valid
}
// sendMessage("invalid"); // Compiler error!
sendMessage(PhoneNumber("1234567890")); // OK
Switch Statements¶
Smell:
public double calculatePay(Employee employee) {
switch (employee.type) {
case HOURLY:
return employee.hours * employee.hourlyRate;
case SALARIED:
return employee.monthlySalary;
case COMMISSIONED:
return employee.baseSalary + employee.commission;
default:
throw new IllegalArgumentException("Unknown employee type");
}
}
// Problem: This switch appears in multiple places!
// If we add a new employee type, we must update all switches.
Refactoring: Replace Conditional with Polymorphism
abstract class Employee {
public abstract double calculatePay();
}
class HourlyEmployee extends Employee {
private double hours;
private double hourlyRate;
public double calculatePay() {
return hours * hourlyRate;
}
}
class SalariedEmployee extends Employee {
private double monthlySalary;
public double calculatePay() {
return monthlySalary;
}
}
class CommissionedEmployee extends Employee {
private double baseSalary;
private double commission;
public double calculatePay() {
return baseSalary + commission;
}
}
// Adding a new type only requires a new class
Speculative Generality¶
Smell:
class AbstractFactoryProviderSingleton:
"""Might need this someday..."""
pass
def calculate_with_future_algorithm_support(data, algorithm='current'):
# Supports 5 algorithms we might need in the future
pass
Refactoring: YAGNI (You Aren't Gonna Need It) — delete it!
def calculate(data):
# Implement what you need now
pass
# Add algorithms when actually needed
Dead Code¶
Smell:
function processOrder(order) {
// validateAddress(order.address); // Commented out
calculateTotal(order);
// if (false) { // Always false
// applyOldDiscountLogic(order);
// }
}
Refactoring: Delete it! Version control remembers.
function processOrder(order) {
calculateTotal(order);
}
Refactoring Techniques¶
Extract Method¶
Before:
def print_invoice(invoice):
print("**** Invoice ****")
print(f"Customer: {invoice['customer']}")
print(f"Amount: ${invoice['amount']}")
# Calculate discount
discount = 0
if invoice['amount'] > 1000:
discount = invoice['amount'] * 0.1
print(f"Discount: ${discount}")
print(f"Total: ${invoice['amount'] - discount}")
After:
def print_invoice(invoice):
print_header()
print_customer_info(invoice)
discount = calculate_discount(invoice)
print_amounts(invoice, discount)
def print_header():
print("**** Invoice ****")
def print_customer_info(invoice):
print(f"Customer: {invoice['customer']}")
print(f"Amount: ${invoice['amount']}")
def calculate_discount(invoice):
if invoice['amount'] > 1000:
return invoice['amount'] * 0.1
return 0
def print_amounts(invoice, discount):
print(f"Discount: ${discount}")
print(f"Total: ${invoice['amount'] - discount}")
Rename Variable/Method¶
Before:
public void calc(int d) {
int r = d * 0.1;
System.out.println(r);
}
After:
public void calculateDiscount(int orderAmount) {
int discountAmount = orderAmount * 0.1;
System.out.println(discountAmount);
}
Introduce Explaining Variable¶
Before:
if ((platform.toUpperCase().includes('MAC') ||
platform.toUpperCase().includes('LINUX')) &&
browser.toUpperCase().includes('CHROME')) {
// ...
}
After:
const isMacOrLinux = platform.toUpperCase().includes('MAC') ||
platform.toUpperCase().includes('LINUX');
const isChrome = browser.toUpperCase().includes('CHROME');
if (isMacOrLinux && isChrome) {
// ...
}
Replace Magic Number with Named Constant¶
Before:
double calculateMonthlyPayment(double principal, double rate, int years) {
return principal * rate / 12 / (1 - pow(1 + rate / 12, -years * 12));
}
After:
const int MONTHS_PER_YEAR = 12;
double calculateMonthlyPayment(double principal, double annualRate, int years) {
double monthlyRate = annualRate / MONTHS_PER_YEAR;
int totalMonths = years * MONTHS_PER_YEAR;
return principal * monthlyRate / (1 - pow(1 + monthlyRate, -totalMonths));
}
Move Method¶
Move a method to the class where it belongs.
Before:
class Customer:
def __init__(self, name):
self.name = name
class Order:
def __init__(self, customer, amount):
self.customer = customer
self.amount = amount
def get_discount(self):
# Uses customer data, not order data
if self.customer.name.startswith('VIP'):
return 0.2
return 0.1
After:
class Customer:
def __init__(self, name):
self.name = name
def get_discount_rate(self):
if self.name.startswith('VIP'):
return 0.2
return 0.1
class Order:
def __init__(self, customer, amount):
self.customer = customer
self.amount = amount
def get_discount(self):
return self.amount * self.customer.get_discount_rate()
Principles¶
DRY: Don't Repeat Yourself¶
Every piece of knowledge should have a single, unambiguous representation.
KISS: Keep It Simple, Stupid¶
Simplicity should be a key goal. Avoid unnecessary complexity.
Complex:
class AbstractFactoryBuilderProvider:
@staticmethod
def create_instance_with_dependency_injection(config):
# 50 lines of complex logic
pass
Simple:
def create_user(name, email):
return User(name, email)
YAGNI: You Aren't Gonna Need It¶
Don't implement features until they're actually needed.
YAGNI violation:
class User {
// Just in case we need these someday...
private String backupEmail;
private String alternatePhone;
private String preferredLanguage; // We only support English
private boolean isVerified; // No verification process yet
}
Principle of Least Surprise¶
Code should behave as users expect.
Surprising:
function getUsers() {
const users = db.query('SELECT * FROM users');
users.forEach(u => u.lastAccessed = Date.now()); // Side effect!
db.update(users);
return users;
}
Unsurprising:
function getUsers() {
return db.query('SELECT * FROM users');
}
function updateLastAccessed(users) {
users.forEach(u => u.lastAccessed = Date.now());
db.update(users);
}
Boy Scout Rule¶
"Leave the code better than you found it."
Even small improvements accumulate: - Rename a confusing variable - Extract a long method - Add a missing comment - Delete dead code
Comments¶
Good Comments¶
Legal comments:
// Copyright (C) 2024 Company Inc. All rights reserved.
Informative comments:
# Returns a list of tuples: (user_id, username, last_login)
def get_active_users():
pass
Explanation of intent:
// We use a hash instead of array because lookup needs to be O(1)
std::unordered_map<int, User> userCache;
Clarification:
// Third-party API returns timestamps in milliseconds, not seconds
const timestampSeconds = apiResponse.timestamp / 1000;
TODO comments:
# TODO: Add caching to improve performance (issue #1234)
def fetch_user_data(user_id):
pass
Bad Comments¶
Redundant comments:
// Get the name
public String getName() {
return name;
}
// Increment i
i++;
Misleading comments:
# Returns the user's age
def get_user_info(user_id):
return db.get_user(user_id) # Returns entire user object, not just age!
Noise comments:
/**
* Constructor.
*/
public User() {
}
Commented-out code:
function processOrder(order) {
// validateAddress(order); // Don't comment out, delete it!
// applyDiscount(order);
calculateTotal(order);
}
Journal comments:
// 2024-01-15: Added validation - John
// 2024-01-20: Fixed bug - Sarah
// 2024-02-01: Refactored - Mike
public void processPayment() {
// Use version control for history!
}
Self-Documenting Code¶
Prefer code that explains itself:
Needs comments:
# Check if user is eligible for discount
if u.t == 'premium' and u.yrs > 2:
pass
Self-documenting:
def is_eligible_for_loyalty_discount(user):
return user.type == 'premium' and user.years_active > 2
if is_eligible_for_loyalty_discount(user):
pass
Formatting¶
Consistent formatting improves readability.
Vertical Formatting¶
Newspaper metaphor: Most important information at top.
# Public interface first
class OrderService:
def process_order(self, order):
self._validate(order)
total = self._calculate_total(order)
self._save(order)
return total
# Private helpers below
def _validate(self, order):
pass
def _calculate_total(self, order):
pass
def _save(self, order):
pass
Horizontal Formatting¶
Keep lines short (80-120 characters).
Bad:
public void createUser(String firstName, String lastName, String email, String phone, Address address, List<String> permissions) { }
Good:
public void createUser(
String firstName,
String lastName,
String email,
String phone,
Address address,
List<String> permissions
) {
// ...
}
Team Conventions¶
Use linters and formatters: - JavaScript: ESLint, Prettier - Python: pylint, black - Java: Checkstyle, Google Java Format - C++: clang-format
Summary¶
| Principle | Guideline |
|---|---|
| Names | Reveal intent, avoid disinformation, use pronounceable names |
| Functions | Small, do one thing, few arguments, no side effects |
| Comments | Explain why, not what; prefer self-documenting code |
| DRY | Don't repeat yourself |
| KISS | Keep it simple |
| YAGNI | You aren't gonna need it |
| Boy Scout Rule | Leave code better than you found it |
Code smells to watch for: - Long methods/large classes - Feature envy - Data clumps - Shotgun surgery - Primitive obsession - Switch statements (consider polymorphism) - Dead code
Exercises¶
Exercise 1: Refactor This Code¶
def p(o):
t = 0
for i in o['items']:
t += i['p'] * i['q']
if o.get('c'):
d = 0.1 if o['c'] == 'SAVE10' else 0.2 if o['c'] == 'SAVE20' else 0
t = t - (t * d)
return t
Refactor this to follow clean code principles.
Exercise 2: Identify Code Smells¶
Review this code and identify at least 5 code smells:
public class UserManager {
public void process(String type, String id, String data) {
if (type.equals("create")) {
// 50 lines of user creation logic
} else if (type.equals("update")) {
// 50 lines of user update logic
} else if (type.equals("delete")) {
// 50 lines of user deletion logic
}
}
public String getUserFirstName(String id) {
String[] data = db.query("SELECT * FROM users WHERE id = " + id);
return data[0];
}
public String getUserLastName(String id) {
String[] data = db.query("SELECT * FROM users WHERE id = " + id);
return data[1];
}
public String getUserEmail(String id) {
String[] data = db.query("SELECT * FROM users WHERE id = " + id);
return data[2];
}
}
Exercise 3: Write Clean Functions¶
Rewrite this function to follow the principles of clean code:
function validateAndProcessForm(formData) {
if (!formData.email || !formData.email.includes('@')) {
alert('Invalid email');
return false;
}
if (!formData.password || formData.password.length < 8) {
alert('Password too short');
return false;
}
if (formData.password !== formData.confirmPassword) {
alert('Passwords do not match');
return false;
}
const user = {
email: formData.email,
password: hash(formData.password),
created: Date.now()
};
db.save(user);
sendWelcomeEmail(user.email);
redirectToHomePage();
return true;
}
Exercise 4: Apply Boy Scout Rule¶
Find a file in your codebase and make it slightly better: 1. Rename one poorly-named variable 2. Extract one long method 3. Remove one piece of dead code 4. Add one explanatory variable
Document your changes.
Exercise 5: Code Review Checklist¶
Create a code review checklist based on this lesson. Include at least: - 5 naming checks - 5 function checks - 5 code smell checks - 3 comment checks
Previous: 07_Design_Patterns.md Next: 09_Error_Handling.md