Bias Search Guide
The bias search functionality in CallMeFair provides comprehensive tools for evaluating bias in machine learning models with respect to sensitive attributes. This guide covers the core search modules and their usage.
Overview
The bias search framework consists of two main modules:
_search_base.py: Core functionality for bias evaluation and model training
fair_search.py: High-level interface for comprehensive bias analysis
The framework supports:
Individual attribute bias evaluation
Attribute combination analysis (2-way and 3-way combinations)
Multiple set operations (union, intersection, differences)
Various ML models (Logistic Regression, CatBoost, XGBoost, MLP)
Parallel processing for efficient evaluation
Pretty table output for results
Core Classes
BaseSearch
The BaseSearch class provides the foundation for bias evaluation:
from callmefair.search._search_base import BaseSearch
# Initialize with dataset and target variable
searcher = BaseSearch(df, 'target')
# Evaluate bias for a specific attribute
results = searcher.evaluate_attribute('gender', iterate=10, model_name='lr')
Key Methods:
evaluate_attribute(): Evaluate bias for a single attribute
__pre_attribute_bias(): Prepare datasets for evaluation
__predict_attribute_bias(): Compute fairness metrics
BiasSearch
The BiasSearch class extends BaseSearch with comprehensive analysis capabilities:
from callmefair.search.fair_search import BiasSearch
# Initialize with multiple attributes
searcher = BiasSearch(df, 'target', ['gender', 'race', 'age'])
# Evaluate individual attributes
table, printable = searcher.evaluate_average()
# Evaluate attribute combinations
table, printable = searcher.evaluate_combinations()
Key Methods:
evaluate_average(): Evaluate all individual attributes
evaluate_combinations(): Evaluate 2-way and 3-way combinations
evaluate_combination_average(): Compare different set operations
Attribute Combination Operations
The framework supports various set operations for combining sensitive attributes:
Union (OR)
Combines attributes using logical OR operation:
from callmefair.search._search_base import CType, combine_attributes
# gender OR race (either attribute is 1)
result_df = combine_attributes(df, 'gender', 'race', CType.union)
Intersection (AND)
Combines attributes using logical AND operation:
# gender AND race (both attributes are 1)
result_df = combine_attributes(df, 'gender', 'race', CType.intersection)
Set Differences
Computes set differences between attributes:
# gender - race (gender=1 AND race=0)
result_df = combine_attributes(df, 'gender', 'race', CType.difference_1_minus_2)
# race - gender (race=1 AND gender=0)
result_df = combine_attributes(df, 'gender', 'race', CType.difference_2_minus_1)
Symmetric Difference (XOR)
Combines attributes using XOR operation:
# gender XOR race (exactly one attribute is 1)
result_df = combine_attributes(df, 'gender', 'race', CType.symmetric_difference)
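The five operations can be compared side by side on plain 0/1 values. The following is a minimal sketch of the set semantics described above, not CallMeFair's implementation of combine_attributes. The helper name combine_binary and the string keys are hypothetical; the keys only mirror the CType member names used in this guide.

```python
from typing import Callable, Dict, List

# Hypothetical mapping from operation name to a function on two 0/1 values.
# The names mirror the CType members shown above; this is not library code.
OPERATIONS: Dict[str, Callable[[int, int], int]] = {
    "union": lambda a, b: a | b,
    "intersection": lambda a, b: a & b,
    "difference_1_minus_2": lambda a, b: a & (1 - b),
    "difference_2_minus_1": lambda a, b: (1 - a) & b,
    "symmetric_difference": lambda a, b: a ^ b,
}


def combine_binary(col1: List[int], col2: List[int], op: str) -> List[int]:
    """Combine two equally long 0/1 columns element by element."""
    if len(col1) != len(col2):
        raise ValueError("columns must have the same length")
    fn = OPERATIONS[op]
    return [fn(a, b) for a, b in zip(col1, col2)]


# Four rows covering every pairing of the two binary attributes.
gender = [1, 1, 0, 0]
race = [1, 0, 1, 0]

for name in OPERATIONS:
    print(f"{name:22s}", combine_binary(gender, race, name))
```

Because the four rows cover every possible pairing of the two attributes, the printed lists act as a truth table for each operation.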
Supported Models
The framework supports multiple machine learning models:
Logistic Regression
Fast and interpretable model for bias evaluation:
results = searcher.evaluate_attribute('gender', model_name='lr')
CatBoost
Gradient boosting model with optimized parameters:
results = searcher.evaluate_attribute('gender', model_name='cat')
XGBoost
Advanced gradient boosting with balanced parameters:
results = searcher.evaluate_attribute('gender', model_name='xgb')
Multi-layer Perceptron
Neural network with adaptive learning:
results = searcher.evaluate_attribute('gender', model_name='mlp')
Usage Examples
Individual Attribute Evaluation
Evaluate bias for individual sensitive attributes:
from callmefair.search.fair_search import BiasSearch
# Initialize searcher
searcher = BiasSearch(df, 'target', ['gender', 'race', 'age'])
# Evaluate all attributes
table, printable = searcher.evaluate_average(iterate=10, model_name='lr')
print(printable)
Attribute Combinations
Evaluate bias for attribute combinations:
# Evaluate 2-way and 3-way combinations
table, printable = searcher.evaluate_combinations()
print(printable)
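To get a sense of how many evaluations a combination run involves, the 2-way and 3-way groupings can be enumerated with the standard library. This sketch only lists the groupings; how evaluate_combinations() orders or combines them internally is not specified in this guide.

```python
from itertools import combinations

attributes = ["gender", "race", "age"]

# All unordered pairs and triples of the sensitive attributes.
pairs = list(combinations(attributes, 2))
triples = list(combinations(attributes, 3))

print(pairs)
print(triples)
print(len(pairs) + len(triples), "groupings in total")
```

The number of groupings grows quickly with the number of attributes: ten attributes give 45 pairs and 120 triples, each of which is trained and evaluated over the requested iterations.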
Set Operation Comparison
Compare different ways of combining attributes:
# Compare all set operations between gender and race
table, printable = searcher.evaluate_combination_average('gender', 'race')
print(printable)
Advanced Usage
Custom Dataset Preparation
Use custom datasets for evaluation:
# Use modified dataset
modified_df = df.copy()
modified_df['new_feature'] = some_transformation(modified_df)
results = searcher.evaluate_attribute('gender', df_new=modified_df)
Handling Class Imbalance
Apply NearMiss undersampling for imbalanced datasets:
# Apply class balancing
results = searcher.evaluate_attribute('gender', treat_umbalance=True)
Parallel Processing
The framework automatically uses parallel processing for certain models:
# Logistic Regression and MLP use multiprocessing
results = searcher.evaluate_attribute('gender', model_name='lr') # Parallel
# CatBoost and XGBoost use sequential processing
results = searcher.evaluate_attribute('gender', model_name='cat') # Sequential
Output Interpretation
Fairness Scores
The framework computes two types of fairness scores:
Raw Score: Direct fairness metric value
Overall Score: Normalized fairness score (0-1 scale)
Higher scores indicate better fairness (less bias).
Result Tables
Results are presented in pretty tables with columns:
Attribute: Name of the sensitive attribute or combination
Raw Fairness Score: Direct fairness metric value
Normalized Fairness Score: Normalized score (0-1)
Example Output:
+-----------+--------------------+---------------------------+
| Attribute | Raw Fairness Score | Normalized Fairness Score |
+-----------+--------------------+---------------------------+
| gender    | 0.85               | 0.92                      |
| race      | 0.72               | 0.78                      |
| age       | 0.91               | 0.95                      |
+-----------+--------------------+---------------------------+
Best Practices
Multiple Iterations: Use at least 5-10 iterations for robust results
Model Selection: Start with Logistic Regression for interpretability
Attribute Combinations: Use intersection for the most meaningful combinations
Class Balancing: Apply NearMiss for highly imbalanced datasets
Parallel Processing: Use 'lr' or 'mlp' models for faster processing
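The advice on multiple iterations is easier to judge when the spread across runs is reported next to the average. The sketch below assumes the per-iteration scores are available as a plain list of floats; the values are made up for illustration, and the actual structure returned by evaluate_attribute() may differ.

```python
from statistics import mean, stdev

# Made-up per-iteration fairness scores, for illustration only.
# The actual structure returned by evaluate_attribute() may differ.
scores = [0.80, 0.84, 0.78, 0.82, 0.81]

avg = mean(scores)
spread = stdev(scores)  # sample standard deviation across iterations

print(f"mean={avg:.3f} stdev={spread:.3f} n={len(scores)}")
```

A large standard deviation relative to the gap between two attributes suggests that more iterations are needed before ranking them.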
Troubleshooting
Common Issues
Binary Attributes Required: All sensitive attributes must be binary (0 or 1)
# Convert categorical to binary
df['gender'] = (df['gender'] == 'male').astype(int)
Memory Issues: Reduce iterations or use smaller datasets
# Use fewer iterations
results = searcher.evaluate_attribute('gender', iterate=5)
Slow Performance: Use a model that runs in parallel ('lr' or 'mlp') or reduce dataset size
# Use Logistic Regression for speed
results = searcher.evaluate_attribute('gender', model_name='lr')
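The binary requirement listed above also applies to numeric attributes such as age. A minimal sketch of thresholding in plain Python follows. The cutoff of 40 and the helper name binarize are assumptions chosen for illustration; the appropriate cutoff depends on the dataset and the group being studied.

```python
from typing import Iterable, List


def binarize(values: Iterable[float], threshold: float) -> List[int]:
    """Map each value to 1 if it is at or above the threshold, else 0."""
    return [1 if v >= threshold else 0 for v in values]


ages = [23, 41, 40, 67, 18]  # illustrative values only
age_binary = binarize(ages, threshold=40)
print(age_binary)
```

With a pandas DataFrame, the equivalent one-liner is (df['age'] >= 40).astype(int), following the same pattern as the gender conversion above.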
Performance Tips
Use Logistic Regression for quick prototyping
Apply class balancing only when necessary
Prefer models that run in parallel ('lr' or 'mlp') for large datasets
Consider feature scaling for better model performance
Cache results for repeated evaluations
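The caching tip can be sketched with functools.lru_cache. Here run_evaluation is a hypothetical stand-in for an expensive call such as searcher.evaluate_attribute(...); it is not part of CallMeFair, and the string it returns is a placeholder for real results.

```python
from functools import lru_cache

call_count = 0  # tracks how often the expensive function body actually runs


@lru_cache(maxsize=None)
def run_evaluation(attribute: str, model_name: str, iterate: int) -> str:
    """Hypothetical stand-in for an expensive bias evaluation.

    A real wrapper would call the searcher here and return its results.
    """
    global call_count
    call_count += 1
    return f"results for {attribute} ({model_name}, {iterate} iterations)"


first = run_evaluation("gender", "lr", 10)
second = run_evaluation("gender", "lr", 10)  # served from the cache
third = run_evaluation("race", "lr", 10)     # new arguments, evaluated again

print(call_count)
```

lru_cache needs hashable arguments, so the cache is keyed on attribute names and settings rather than on the DataFrame itself. If the dataset changes, calling run_evaluation.cache_clear() avoids returning stale results.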
Integration with Other Modules
The bias search functionality integrates with other CallMeFair modules:
Bias Mitigation: Use search results to identify which attributes need mitigation
Grid Search: Combine with bias mitigation techniques
Utilities: Use fairness score calculation from fair_util
from callmefair.search.fair_search import BiasSearch
from callmefair.bm import BMManager
# Identify bias
searcher = BiasSearch(df, 'target', ['gender', 'race'])
table, printable = searcher.evaluate_average()
# Apply mitigation
bm_manager = BMManager()
mitigated_df = bm_manager.apply_mitigation(df, 'reweighing', 'gender')