Quick Start Guide
This guide will help you get started with CallMeFair, a comprehensive framework for bias mitigation in AI systems. You’ll learn how to install the framework, load your data, and apply various bias mitigation techniques.
Installation
First, install CallMeFair and its dependencies:
pip install callmefair
Or install from source:
git clone https://github.com/your-repo/callmefair.git
cd callmefair
pip install -e .
Basic Usage
Here’s a simple example that demonstrates the core functionality:
import pandas as pd
import numpy as np
from callmefair.util.fair_util import BMInterface
from callmefair.mitigation.fair_bm import BMManager
# Load your data
# Replace with your actual data files
train_df = pd.read_csv('train.csv')
val_df = pd.read_csv('val.csv')
test_df = pd.read_csv('test.csv')
# Initialize the bias mitigation interface
bm_interface = BMInterface(
    df_train=train_df,
    df_val=val_df,
    df_test=test_df,
    label='target',        # Your target column name
    protected=['gender']   # Your protected attributes
)
# Define privileged and unprivileged groups
privileged_groups = [{'gender': 1}] # Male group
unprivileged_groups = [{'gender': 0}] # Female group
# Create bias mitigation manager
bm_manager = BMManager(
    bmI=bm_interface,
    privileged_group=privileged_groups,
    unprivileged_group=unprivileged_groups
)
# Apply preprocessing bias mitigation
bm_manager.pre_Reweighing()
# Train a model (example with sklearn)
from sklearn.ensemble import RandomForestClassifier
X_train, y_train = bm_interface.get_train_xy()
model = RandomForestClassifier()
model.fit(X_train, y_train)
# Make predictions
X_test, y_test = bm_interface.get_test_xy()
predictions = model.predict(X_test)
# Apply postprocessing bias mitigation
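from aif360.datasets import BinaryLabelDataset
# Wrap predictions in BinaryLabelDataset format.
# Copy first so the ground-truth labels held by bm_interface are not overwritten
# (assumes the getters return AIF360 BinaryLabelDataset objects, which provide copy()).
test_pred_dataset = bm_interface.get_test_BLD().copy(deepcopy=True)
test_pred_dataset.labels = predictions.reshape(-1, 1)
X_val, _ = bm_interface.get_val_xy()
val_pred_dataset = bm_interface.get_val_BLD().copy(deepcopy=True)
val_pred_dataset.labels = model.predict(X_val).reshape(-1, 1)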
# Apply Calibrated Equalized Odds postprocessing
mitigated_predictions = bm_manager.pos_CEO(
    valid_BLD_pred=val_pred_dataset,
    test_BLD_pred=test_pred_dataset
)
Available Bias Mitigation Techniques
CallMeFair supports three categories of bias mitigation techniques:
Preprocessing Techniques
Applied to training data before model training:
| Method | Description |
|---|---|
| Reweighing | Assigns different weights to instances based on group membership |
| Disparate Impact Remover (DIR) | Repairs training data to remove disparate impact |
| Learning Fair Representations (LFR) | Learns fair representations that remove sensitive information |
# Apply preprocessing techniques
bm_manager.pre_Reweighing() # Reweighing
bm_manager.pre_DR('gender') # Disparate Impact Remover
bm_manager.pre_LFR() # Learning Fair Representations
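To make the idea behind Reweighing concrete, the following is a minimal sketch of the standard weighting rule, where each (group, label) cell receives the weight P(group) * P(label) / P(group, label). It is written in plain Python for illustration only and is not CallMeFair's implementation; the function name `reweighing_weights` and the toy data are hypothetical.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Return one weight per observed (group, label) cell.

    weight = P(group) * P(label) / P(group, label), so cells that are
    under-represented relative to independence get weights above 1.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for (g, y) in joint_counts
    }

# Toy data: group 1 = privileged, label 1 = favorable outcome.
groups = [1, 1, 1, 1, 0, 0, 0, 0]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweighing_weights(groups, labels)

for cell, weight in sorted(weights.items()):
    print(cell, round(weight, 3))
```

In this toy data the unprivileged group rarely receives the favorable label, so the cell (0, 1) is weighted up and the cell (1, 1) is weighted down.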
In-processing Techniques
Applied during model training:
Method |
Description |
|---|---|
Adversarial Debiasing |
Trains classifier with adversarial fairness constraint |
MetaFair Classifier |
Uses meta-learning for fair classification |
# Apply in-processing techniques
ad_model = bm_manager.in_AD(debias=True) # Adversarial Debiasing
meta_model = bm_manager.in_Meta('gender', tau=0.1) # MetaFair
Postprocessing Techniques
Applied to model predictions after training:
| Method | Description |
|---|---|
| Calibrated Equalized Odds (CEO) | Adjusts predictions for equalized odds with calibration |
| Equalized Odds (EO) | Adjusts predictions for equalized odds |
| Reject Option Classification (ROC) | Reassigns labels for uncertain predictions near the decision boundary in favor of the unprivileged group |
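# Apply postprocessing techniques
# val_pred_dataset and test_pred_dataset are the prediction datasets built in Basic Usage
mitigated_ceo = bm_manager.pos_CEO(val_pred_dataset, test_pred_dataset)
mitigated_eo = bm_manager.pos_EO(val_pred_dataset, test_pred_dataset)
mitigated_roc = bm_manager.pos_ROC(val_pred_dataset, test_pred_dataset)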
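As an illustration of how a postprocessing step can change predictions, here is a minimal sketch of the reject option idea: scores close to the decision threshold are treated as uncertain, and within that band the unprivileged group receives the favorable label while the privileged group receives the unfavorable one. The function `reject_option_predict` and the fixed `threshold` and `margin` values are assumptions made for this sketch; library implementations typically tune these on validation data, and this is not the code behind `pos_ROC`.

```python
def reject_option_predict(scores, is_privileged, threshold=0.5, margin=0.1):
    """Label scores, overriding decisions inside the uncertain band.

    Inside [threshold - margin, threshold + margin]:
      unprivileged -> favorable (1), privileged -> unfavorable (0).
    Outside the band the usual threshold rule applies.
    """
    predictions = []
    for score, privileged in zip(scores, is_privileged):
        if abs(score - threshold) <= margin:
            predictions.append(0 if privileged else 1)
        else:
            predictions.append(1 if score > threshold else 0)
    return predictions

# Hypothetical scores: the first two fall inside the uncertain band.
scores = [0.55, 0.45, 0.90, 0.20]
is_privileged = [True, False, False, True]
preds = reject_option_predict(scores, is_privileged)
print(preds)
```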
Evaluation and Metrics
CallMeFair provides comprehensive evaluation tools:
from callmefair.util.fair_util import BMMetrics
# Create metrics evaluator
metrics = BMMetrics(
    bmI=bm_interface,
    class_array=np.array([0, 1]),  # Class labels
    pred_val=val_pred_dataset,
    pred_test=test_pred_dataset,
    privileged_group=privileged_groups,
    unprivileged_group=unprivileged_groups
)
# Get comprehensive fairness report
report = metrics.get_report()
print("Fairness Report:")
print(f"Statistical Parity Difference: {report['spd']:.4f}")
print(f"Equal Opportunity Difference: {report['eq_opp_diff']:.4f}")
print(f"Average Odds Difference: {report['avg_odd_diff']:.4f}")
print(f"Disparate Impact: {report['disparate_impact']:.4f}")
print(f"Theil Index: {report['theil_idx']:.4f}")
# Get overall fairness score
score_dict = metrics.get_score()
print(f"Overall Fairness Score: {score_dict['overall_score']:.4f}")
print(f"Is Fair: {score_dict['is_fair']}")
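For readers unfamiliar with the metrics in the report, the sketch below computes four of them from raw predictions using their usual definitions (unprivileged minus privileged, as in AIF360). It is plain Python for illustration, not the `BMMetrics` implementation; the function `group_metrics` is hypothetical, and the dictionary keys simply mirror the report keys used above. The Theil index is omitted.

```python
def rate(values):
    """Mean of a list of 0/1 values, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0

def group_metrics(y_true, y_pred, is_privileged):
    """Compute group fairness metrics as unprivileged minus privileged."""
    def stats(privileged):
        pairs = [(t, p) for t, p, g in zip(y_true, y_pred, is_privileged)
                 if g == privileged]
        selection = rate([p for _, p in pairs])       # P(pred = 1)
        tpr = rate([p for t, p in pairs if t == 1])   # true positive rate
        fpr = rate([p for t, p in pairs if t == 0])   # false positive rate
        return selection, tpr, fpr

    sel_u, tpr_u, fpr_u = stats(False)
    sel_p, tpr_p, fpr_p = stats(True)
    return {
        "spd": sel_u - sel_p,
        "disparate_impact": sel_u / sel_p if sel_p else float("nan"),
        "eq_opp_diff": tpr_u - tpr_p,
        "avg_odd_diff": 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p)),
    }

# Toy data: first four rows are privileged, last four unprivileged.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
is_privileged = [True, True, True, True, False, False, False, False]
toy_metrics = group_metrics(y_true, y_pred, is_privileged)
print(toy_metrics)
```

A value of 0 is ideal for the three difference metrics, and 1 is ideal for disparate impact.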
Fairness Score Calculation
The calculate_fairness_score function aggregates multiple fairness metrics:
from callmefair.util.fair_util import calculate_fairness_score
# Calculate fairness score from individual metrics
fairness_result = calculate_fairness_score(
    EOD=0.05,  # Equal Opportunity Difference
    AOD=0.03,  # Average Odds Difference
    SPD=0.08,  # Statistical Parity Difference
    DI=0.95,   # Disparate Impact
    TI=0.12    # Theil Index
)
print(f"Overall fairness score: {fairness_result['overall_score']}")
print(f"Is fair: {fairness_result['is_fair']}")
# Check individual metric evaluations
for metric, is_acceptable in fairness_result['metric_evaluations'].items():
    status = "✓" if is_acceptable else "✗"
    print(f"{metric}: {status}")
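The guide does not state which thresholds `calculate_fairness_score` applies, so the following is only a sketch of what a per-metric acceptance check could look like. Every tolerance here is an assumption chosen for illustration (the disparate impact bounds follow the common four-fifths rule of thumb), and `evaluate_thresholds` is a hypothetical helper, not the library function.

```python
def evaluate_thresholds(EOD, AOD, SPD, DI, TI,
                        diff_tol=0.1, di_bounds=(0.8, 1.25), ti_tol=0.25):
    """Check each metric against an illustrative tolerance.

    All thresholds are assumptions for this sketch; consult the API
    reference for the values CallMeFair actually uses.
    """
    evaluations = {
        "EOD": abs(EOD) <= diff_tol,                 # ideal value 0
        "AOD": abs(AOD) <= diff_tol,                 # ideal value 0
        "SPD": abs(SPD) <= diff_tol,                 # ideal value 0
        "DI": di_bounds[0] <= DI <= di_bounds[1],    # ideal value 1
        "TI": TI <= ti_tol,                          # ideal value 0
    }
    return {
        "metric_evaluations": evaluations,
        "is_fair": all(evaluations.values()),
    }

result = evaluate_thresholds(EOD=0.05, AOD=0.03, SPD=0.08, DI=0.95, TI=0.12)
borderline = evaluate_thresholds(EOD=0.05, AOD=0.03, SPD=0.08, DI=0.70, TI=0.12)
print(result["is_fair"], borderline["is_fair"])
```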
Advanced Usage
Bias Search and Evaluation
CallMeFair provides comprehensive bias search functionality to identify and evaluate bias in your datasets and models:
from callmefair.search.fair_search import BiasSearch
import pandas as pd
# Initialize bias search with multiple sensitive attributes
# df is a pandas DataFrame containing the 'target' column and the sensitive attributes
searcher = BiasSearch(df, 'target', ['gender', 'race', 'age'])
# Evaluate individual attributes
table, printable = searcher.evaluate_average(iterate=10, model_name='lr')
print("Individual Attribute Bias:")
print(printable)
# Evaluate attribute combinations (2-way and 3-way)
table, printable = searcher.evaluate_combinations()
print("Attribute Combination Bias:")
print(printable)
# Compare different set operations between attributes
table, printable = searcher.evaluate_combination_average('gender', 'race')
print("Set Operation Comparison:")
print(printable)
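To show what 2-way and 3-way combinations mean for three sensitive attributes, and what a set operation between two binary attributes can look like, here is a small standalone sketch. It only enumerates the candidates; which combinations and set operations `BiasSearch` actually evaluates is not specified in this guide, so treat the choices below as assumptions.

```python
from itertools import combinations

sensitive_attributes = ["gender", "race", "age"]

# All 2-way and 3-way attribute combinations.
attribute_sets = [
    combo
    for size in (2, 3)
    for combo in combinations(sensitive_attributes, size)
]
print(attribute_sets)

# Two binary attributes (1 = privileged) combined with set operations.
gender = [1, 1, 0, 0]
race = [1, 0, 1, 0]
intersection = [g & r for g, r in zip(gender, race)]  # privileged in both
union = [g | r for g, r in zip(gender, race)]         # privileged in either
print(intersection, union)
```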
Grid Search for Bias Mitigation Combinations
CallMeFair provides a comprehensive grid search framework for systematically evaluating different combinations of bias mitigation techniques:
from callmefair.mitigation.fair_grid import BMGridSearch
from callmefair.mitigation.fair_bm import BMType
from sklearn.ensemble import RandomForestClassifier
# Define bias mitigation combinations to test
bm_combinations = [
    [BMType.preReweighing],                  # Only reweighing
    [BMType.preDisparate],                   # Only disparate impact remover
    [BMType.preReweighing, BMType.posCEO],   # Preprocessing + postprocessing
    [BMType.inAdversarial],                  # Only in-processing
    [BMType.preLFR, BMType.posEO]            # Complex combination
]
# Initialize grid search
grid_search = BMGridSearch(
    bmI=bm_interface,
    model=RandomForestClassifier(),
    bm_list=bm_combinations,
    privileged_group=privileged_groups,
    unprivileged_group=unprivileged_groups
)
# Run comprehensive evaluation
grid_search.run_single_sensitive()
# Results are automatically logged to CSV files
Combining Multiple Techniques
You can combine different bias mitigation techniques manually:
# Apply preprocessing
bm_manager.pre_Reweighing()
# Train model with in-processing
ad_model = bm_manager.in_AD(debias=True)
ad_model.fit(X_train, y_train)
# Apply postprocessing (prediction datasets built as in Basic Usage)
mitigated_predictions = bm_manager.pos_CEO(val_pred_dataset, test_pred_dataset)
Custom Evaluation
Create custom evaluation scenarios:
# Evaluate before bias mitigation
original_metrics = BMMetrics(...)
original_report = original_metrics.get_report()
# Apply bias mitigation
bm_manager.pre_Reweighing()
# Evaluate after bias mitigation
mitigated_metrics = BMMetrics(...)
mitigated_report = mitigated_metrics.get_report()
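# Compare results (report keys as in Evaluation and Metrics).
# Difference metrics are signed, so compare absolute values: positive = closer to 0.
improvement = {
    'SPD': abs(original_report['spd']) - abs(mitigated_report['spd']),
    'EOD': abs(original_report['eq_opp_diff']) - abs(mitigated_report['eq_opp_diff']),
    'AOD': abs(original_report['avg_odd_diff']) - abs(mitigated_report['avg_odd_diff'])
}
print("Improvement in fairness metrics:", improvement)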
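When comparing reports, it helps to measure how far each metric is from its ideal value, since difference metrics can change sign and disparate impact is ideal at 1 rather than 0. The sketch below is one way to do that in plain Python; `fairness_improvement` and the example numbers are hypothetical and not part of CallMeFair.

```python
def fairness_improvement(before, after, ideal=None):
    """Return, per shared metric, how much closer `after` is to the ideal.

    Metrics default to an ideal of 0.0; pass `ideal` to override
    (for example 1.0 for disparate impact). Positive means improvement.
    """
    ideal = ideal or {}
    return {
        key: abs(before[key] - ideal.get(key, 0.0))
             - abs(after[key] - ideal.get(key, 0.0))
        for key in before
        if key in after
    }

# Hypothetical report values before and after mitigation.
before = {"spd": -0.20, "eq_opp_diff": -0.15, "disparate_impact": 0.70}
after = {"spd": 0.05, "eq_opp_diff": -0.05, "disparate_impact": 0.95}
gains = fairness_improvement(before, after, ideal={"disparate_impact": 1.0})

for name, gain in gains.items():
    print(f"{name}: {gain:+.2f}")
```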
Best Practices
Data Preparation
- Ensure your data is properly split into train/validation/test sets
- Identify all protected attributes in your dataset
- Handle missing values appropriately
Group Definition
- Clearly define privileged and unprivileged groups
- Consider intersectional fairness (multiple protected attributes)
- Document your group definitions for reproducibility
Technique Selection
- Start with preprocessing techniques for simplicity
- Use in-processing for better performance when possible
- Apply postprocessing as a final fairness adjustment
Evaluation
- Always evaluate both accuracy and fairness metrics
- Use multiple fairness metrics for comprehensive assessment
- Consider the trade-off between accuracy and fairness
Validation
- Use cross-validation for robust evaluation
- Test on multiple datasets when possible
- Document your experimental setup
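The train/validation/test split recommended above can be sketched with the standard library alone. The helper `split_indices`, the 60/20/20 proportions, and the seed are assumptions for illustration; in practice you may prefer a stratified split so that each group and label is represented in every subset.

```python
import random

def split_indices(n_rows, val_frac=0.2, test_frac=0.2, seed=42):
    """Shuffle row indices reproducibly and split them into three disjoint sets."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = int(n_rows * test_frac)
    n_val = int(n_rows * val_frac)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = split_indices(100)
print(len(train_idx), len(val_idx), len(test_idx))

# With a pandas DataFrame `df`, the subsets would then be selected as
# df.iloc[train_idx], df.iloc[val_idx] and df.iloc[test_idx].
```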
Troubleshooting
Common Issues and Solutions
- Import Errors
  - Ensure all dependencies are installed: pip install -r requirements.txt
  - Check Python version compatibility (3.8+)
- Data Format Issues
  - Ensure your DataFrame has the correct column names
  - Verify that protected attributes are properly encoded
  - Check that target variable is binary (0/1)
- Memory Issues
  - Use smaller batch sizes for large datasets
  - Consider data sampling for initial experimentation
  - Use efficient data structures (numpy arrays instead of lists)
- Fairness Metrics
  - Ensure groups are properly defined
  - Check for sufficient samples in each group
  - Verify that predictions are in the correct format
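Several of the data format and group size checks above can be automated before any mitigation is run. The sketch below uses hypothetical helpers (`check_binary`, `small_cells`) and an arbitrary `min_count` of 30; it accepts any iterable of values, so a pandas column such as train_df['target'] can be passed directly.

```python
from collections import Counter

def check_binary(values, name):
    """Raise ValueError unless every value is 0 or 1."""
    distinct = set(values)
    if not distinct <= {0, 1}:
        found = sorted(distinct, key=str)
        raise ValueError(f"{name} must contain only 0/1, found {found}")

def small_cells(groups, labels, min_count=30):
    """Return the (group, label) cells with fewer than min_count rows."""
    counts = Counter(zip(groups, labels))
    cells = [(g, y) for g in (0, 1) for y in (0, 1)]
    return {cell: counts[cell] for cell in cells if counts[cell] < min_count}

# Toy columns standing in for a protected attribute and a target.
gender = [1, 1, 0, 1, 0]
target = [1, 0, 1, 1, 1]

check_binary(gender, "gender")
check_binary(target, "target")
sparse = small_cells(gender, target, min_count=1)
print("Cells with too few rows:", sparse)
```

An empty cell, such as unprivileged rows with the unfavorable label in this toy data, makes group-conditional rates undefined and is a common cause of NaN values in fairness metrics.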
Next Steps
Now that you’ve completed the quick start guide, you can:
- Explore the API Reference: Learn about all available classes and methods
- Read the Theory Guide: Understand the mathematical foundations
- Try the Examples: Work through comprehensive examples
- Contribute: Help improve the framework
For more advanced usage, see the bias mitigation guide (../user_guide/bias_mitigation_guide) and the evaluation guide (../user_guide/evaluation_guide).