astrodata.ml.models package

Submodules

astrodata.ml.models.BaseMlModel module

class astrodata.ml.models.BaseMlModel.BaseMlModel

Bases: ABC

Abstract base class for machine learning models. Defines the standard interface expected from all models.

clone()

Create a (shallow) clone of this model instance.

Returns:

Cloned model instance.

Return type:

BaseMlModel
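
Example (illustrative sketch): what "shallow" means for a clone. ToyModel is a hypothetical stand-in and not part of astrodata, and the use of copy.copy here is an assumption about how such a clone could be made. A shallow copy creates a new object but shares nested mutable attributes with the original.

```python
import copy


class ToyModel:
    """Hypothetical stand-in for a model; not part of astrodata."""

    def __init__(self, params):
        self.params = params  # nested mutable attribute

    def clone(self):
        # Assumption: a shallow clone copies the instance, not its attributes.
        return copy.copy(self)


original = ToyModel({"alpha": 0.1})
cloned = original.clone()

# The clone is a distinct object...
is_distinct = cloned is not original

# ...but the nested dict is shared, so a mutation is visible from both.
cloned.params["alpha"] = 0.5
shared_value = original.params["alpha"]
```

If independent hyperparameters are needed after cloning, replacing the attribute on the clone (rather than mutating it in place) avoids touching the original.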

abstractmethod fit(X, y, **kwargs)

Fit the model to the training data.

Parameters:
  • X (Any) – Training data features.

  • y (Any) – Training data targets.

  • **kwargs – Additional fit options.

Returns:

Returns self.

Return type:

BaseMlModel

abstractmethod get_metrics(X_test, y_test, **kwargs)

Compute and return model metrics on test data.

Parameters:
  • X_test (Any) – Test data features.

  • y_test (Any) – Test data targets.

  • **kwargs – Additional metrics options.

Returns:

Dictionary of metric names and values.

Return type:

dict

get_params(**kwargs)

Get hyperparameters of the model.

Returns:

Model hyperparameters.

Return type:

dict

Raises:

NotImplementedError – If not implemented by the subclass.

abstractmethod get_scorer_metric()

Return the default metric used by the model's score function.

Return type:

BaseMetric

abstractmethod load(filepath, **kwargs)

Load the model from the given filepath.

Parameters:
  • filepath (str) – Path to load the model from.

  • **kwargs – Additional load options.

Returns:

The loaded model instance.

Return type:

BaseMlModel

abstractmethod predict(X, **kwargs)

Predict using the trained model.

Parameters:
  • X (Any) – Input data.

  • **kwargs – Additional options for prediction.

Returns:

Model predictions.

Return type:

Any

abstractmethod save(filepath, **kwargs)

Save the model to the given filepath.

Parameters:
  • filepath (str) – Path to save the model.

  • **kwargs – Additional save options.

Return type:

None

abstractmethod score(X, y, **kwargs)

Compute the model score on test data.

Parameters:
  • X (Any) – Test data features.

  • y (Any) – Test data targets.

  • **kwargs – Additional options for scoring.

Returns:

Model score.

Return type:

float

set_params(**kwargs)

Set hyperparameters of the model.

Parameters:

**kwargs – Model hyperparameters.

Raises:

NotImplementedError – If not implemented by the subclass.

Return type:

None
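
Example (illustrative sketch): implementing the interface in a subclass. To stay self-contained, the code below defines ModelInterface, a hypothetical stand-in that mirrors only part of the abstract interface documented above (get_scorer_metric, get_params and set_params are left out). MeanRegressor and the use of pickle for persistence are likewise assumptions made for illustration.

```python
import pickle
from abc import ABC, abstractmethod
from statistics import fmean


class ModelInterface(ABC):
    """Hypothetical stand-in mirroring part of the documented interface."""

    @abstractmethod
    def fit(self, X, y, **kwargs): ...

    @abstractmethod
    def predict(self, X, **kwargs): ...

    @abstractmethod
    def score(self, X, y, **kwargs): ...

    @abstractmethod
    def get_metrics(self, X_test, y_test, **kwargs): ...

    @abstractmethod
    def save(self, filepath, **kwargs): ...

    @abstractmethod
    def load(self, filepath, **kwargs): ...


class MeanRegressor(ModelInterface):
    """Toy model that always predicts the mean of the training targets."""

    def __init__(self):
        self.mean_ = None

    def fit(self, X, y, **kwargs):
        self.mean_ = fmean(y)
        return self  # fit returns self, as documented

    def predict(self, X, **kwargs):
        if self.mean_ is None:
            raise RuntimeError("Model is not fitted yet.")
        return [self.mean_ for _ in X]

    def score(self, X, y, **kwargs):
        # Negative mean squared error, so that higher is better.
        preds = self.predict(X)
        return -fmean([(p - t) ** 2 for p, t in zip(preds, y)])

    def get_metrics(self, X_test, y_test, **kwargs):
        # Dictionary of metric names and values.
        return {"neg_mse": self.score(X_test, y_test)}

    def save(self, filepath, **kwargs):
        with open(filepath, "wb") as fh:
            pickle.dump(self.mean_, fh)

    def load(self, filepath, **kwargs):
        with open(filepath, "rb") as fh:
            self.mean_ = pickle.load(fh)
        return self


model = MeanRegressor().fit([[0], [1], [2]], [1.0, 2.0, 3.0])
predictions = model.predict([[5], [6]])
metrics = model.get_metrics([[0], [1]], [2.0, 2.0])
```

Because every method marked abstract must be overridden, instantiating the stand-in base class directly raises TypeError; only the concrete subclass can be created.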

astrodata.ml.models.PytorchModel module

astrodata.ml.models.SklearnModel module

class astrodata.ml.models.SklearnModel.SklearnModel(model_class, random_state=3170578733, **model_params)

Bases: BaseMlModel

A wrapper class for scikit-learn models to standardize the interface and add extended functionality.

fit(X, y, **fit_params)

Fit the model to data.

Parameters:
  • X (array-like) – Training data features.

  • y (array-like) – Training data targets.

  • **fit_params – Additional fit parameters for the underlying model.

Returns:

The fitted SklearnModel instance.

Return type:

SklearnModel

get_loss_history()

Get the loss history during training, if available.

Returns:

Loss or score history.

Return type:

array-like

Raises:
  • RuntimeError – If the model is not fitted yet.

  • AttributeError – If the underlying model has no ‘train_score_’ attribute.
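
Example (illustrative sketch): the order of the checks described above. This is not the astrodata implementation. LossHistoryWrapper and both fake estimators are hypothetical, and the loss values are made up. It only shows how a "not fitted" check and a train_score_ lookup could produce the two documented exceptions.

```python
class FakeBoostedEstimator:
    """Hypothetical estimator exposing train_score_ after fitting."""

    def fit(self, X, y):
        # One training-loss value per boosting stage (made-up numbers).
        self.train_score_ = [0.9, 0.6, 0.45]
        return self


class FakePlainEstimator:
    """Hypothetical estimator that records no per-stage loss."""

    def fit(self, X, y):
        return self


class LossHistoryWrapper:
    """Sketch of the documented checks; not the astrodata implementation."""

    def __init__(self, estimator):
        self.estimator = estimator
        self.fitted = False

    def fit(self, X, y):
        self.estimator.fit(X, y)
        self.fitted = True
        return self

    @property
    def has_loss_history(self):
        return hasattr(self.estimator, "train_score_")

    def get_loss_history(self):
        if not self.fitted:
            raise RuntimeError("Model is not fitted yet.")
        if not self.has_loss_history:
            raise AttributeError("Underlying model has no 'train_score_' attribute.")
        return self.estimator.train_score_


boosted = LossHistoryWrapper(FakeBoostedEstimator()).fit([[0]], [0])
history = boosted.get_loss_history()

plain = LossHistoryWrapper(FakePlainEstimator()).fit([[0]], [0])
plain_supported = plain.has_loss_history
```

Checking has_loss_history before calling get_loss_history avoids having to catch AttributeError for estimators that record no training loss.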

get_loss_history_metrics(X=None, y=None, metrics=None)

Get the evolution of a metric over training stages.

Parameters:
  • X (array-like, optional) – Data features to use for staged predictions.

  • y (array-like, optional) – True labels for X.

  • metrics (List[BaseMetric], optional) – Metrics to compute for each stage.

Returns:

List of metric values for each training stage.

Return type:

list

Raises:
  • RuntimeError – If the model is not fitted yet.

  • ValueError – If X or y are not provided.

  • AttributeError – If the model does not support staged predictions.
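
Example (illustrative sketch): evaluating one metric per training stage. The helper names and the per-stage predictions are hypothetical. In practice the stages would come from an estimator that supports staged predictions, and the exact shape of the value returned by get_loss_history_metrics for several metrics is not specified here.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def staged_metric_history(staged_predictions, y_true, metric):
    """Evaluate `metric` on the predictions from each training stage."""
    return [metric(y_true, y_pred) for y_pred in staged_predictions]


# Made-up per-stage predictions, as a staged-prediction iterator would yield.
stages = iter([
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
])
y_true = [0, 1, 1, 0]

history = staged_metric_history(stages, y_true, accuracy)
```

The resulting list has one entry per stage, which makes it straightforward to plot a metric against the number of training stages.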

get_metrics(X, y, metrics=None)

Compute metrics for the given data.

Parameters:
  • X (array-like) – Data features.

  • y (array-like) – Data targets.

  • metrics (list of BaseMetric) – List of metrics to compute.

Returns:

Mapping from metric names to metric values.

Return type:

dict
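
Example (illustrative sketch): the shape of the returned mapping. NamedMetric is a hypothetical stand-in for a metric object and does not reflect the BaseMetric API; the data are made up. The sketch only shows a list of metrics being reduced to a dict keyed by metric name.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class NamedMetric:
    """Hypothetical metric: a name plus a function of (y_true, y_pred)."""

    name: str
    func: Callable[[Sequence[float], Sequence[float]], float]


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def max_error(y_true, y_pred):
    return max(abs(t - p) for t, p in zip(y_true, y_pred))


def compute_metrics(y_true, y_pred, metrics):
    """Return a dict mapping each metric's name to its value."""
    return {m.name: m.func(y_true, y_pred) for m in metrics}


results = compute_metrics(
    y_true=[1.0, 2.0, 4.0],
    y_pred=[1.0, 3.0, 2.0],
    metrics=[
        NamedMetric("mae", mean_absolute_error),
        NamedMetric("max_error", max_error),
    ],
)
```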

get_params(**kwargs)

Get parameters for this model.

Returns:

Dictionary containing the model class and its parameters.

Return type:

dict

get_scorer_metric()

Returns the default metric used by the model’s scoring function.

Return type:

SklearnMetric

property has_loss_history: bool

Check if the underlying model supports loss history.

Returns:

True if loss history is available, False otherwise.

Return type:

bool

load(filepath, **kwargs)

Load a model from file.

Parameters:
  • filepath (str) – Path to file where the model is stored.

  • **kwargs – Additional arguments passed to joblib.load.

Return type:

None

predict(X, **predict_params)

Predict targets for samples in X.

Parameters:
  • X (array-like) – Input features.

  • **predict_params – Additional parameters for the underlying model’s predict method.

Returns:

Predicted values.

Return type:

pandas.Series

Raises:

RuntimeError – If the model is not fitted yet.

predict_proba(X, **predict_params)

Predict class probabilities for samples in X.

Parameters:
  • X (array-like) – Input features.

  • **predict_params – Additional parameters for the underlying model’s predict_proba method.

Returns:

Predicted probabilities.

Return type:

pandas.DataFrame

Raises:
  • RuntimeError – If the model is not fitted yet.

  • AttributeError – If the model does not support predict_proba.

save(filepath, **kwargs)

Save the fitted model to a file.

Parameters:
  • filepath (str) – Path to file where the model will be saved.

  • **kwargs – Additional arguments passed to joblib.dump.

Raises:

RuntimeError – If the model is not fitted.

Return type:

None

score(X, y, **kwargs)

Return the score of the model on the given test data and labels.

Parameters:
  • X (array-like) – Test data features.

  • y (array-like) – True labels for X.

  • **kwargs – Additional arguments for the underlying model’s score method.

Returns:

Score of the model.

Return type:

float

Raises:

RuntimeError – If the model is not fitted yet.

set_params(**params)

Set parameters for this model.

Parameters:

**params (dict) – Parameters to set for the model.

Returns:

The updated SklearnModel instance.

Return type:

SklearnModel

astrodata.ml.models.TensorflowModel module

astrodata.ml.models.XGBoostModel module

class astrodata.ml.models.XGBoostModel.XGBoostModel(model_class, random_state=3234787177, **model_params)

Bases: BaseMlModel

Wrapper for XGBoost models, providing a standardized interface and additional utilities.

fit(X, y, **fit_params)

Fit the XGBoost model.

Parameters:
  • X (array-like) – Training features.

  • y (array-like) – Training targets.

  • **fit_params – Additional parameters to pass to model.fit(). If “eval_set” is not provided, it will default to the training data.

Returns:

The fitted XGBoostModel instance (self).

Return type:

XGBoostModel
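
Example (illustrative sketch): defaulting eval_set to the training data. RecordingModel and fit_with_default_eval_set are hypothetical names, and the actual wrapper may implement the default differently. The sketch assumes the XGBoost convention that eval_set is a list of (X, y) pairs.

```python
class RecordingModel:
    """Hypothetical model that records the keyword arguments passed to fit."""

    def fit(self, X, y, **fit_params):
        self.received = fit_params
        return self


def fit_with_default_eval_set(model, X, y, **fit_params):
    # Only fill in eval_set when the caller did not supply one.
    fit_params.setdefault("eval_set", [(X, y)])
    return model.fit(X, y, **fit_params)


X_train, y_train = [[0.0], [1.0]], [0, 1]
X_val, y_val = [[0.5]], [1]

# No eval_set given: the training data is used.
defaulted = fit_with_default_eval_set(RecordingModel(), X_train, y_train)

# Explicit eval_set: the caller's value is kept unchanged.
explicit = fit_with_default_eval_set(
    RecordingModel(), X_train, y_train, eval_set=[(X_val, y_val)]
)
```

With the default in place, the recorded loss history describes the training data; passing a held-out eval_set makes it describe validation performance instead.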

get_loss_history()

Get the training loss history from the last fit.

Returns:

Loss history for the first metric in the evals_result.

Return type:

list or array-like

Raises:

AttributeError – If no loss history is available.
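
Example (illustrative sketch): picking the first metric out of an evals_result-style mapping. XGBoost reports evaluation results as a dict keyed by dataset name and then by metric name. The helper name and the numbers below are made up, and taking the first dataset is an assumption, since only "the first metric" is specified above.

```python
def first_metric_history(evals_result):
    """Return (dataset, metric, values) for the first metric found."""
    if not evals_result:
        raise AttributeError("No loss history is available.")
    # Dicts preserve insertion order, so "first" is well defined.
    dataset_name = next(iter(evals_result))
    metric_name = next(iter(evals_result[dataset_name]))
    return dataset_name, metric_name, evals_result[dataset_name][metric_name]


# Made-up results in the nested dataset -> metric -> values layout.
evals_result = {
    "validation_0": {
        "logloss": [0.62, 0.48, 0.39],
        "error": [0.30, 0.20, 0.15],
    }
}

dataset, metric, values = first_metric_history(evals_result)
```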

get_loss_history_metrics(X=None, y=None, metrics=None)

Get the evolution of a custom metric over boosting rounds.

Parameters:
  • X (array-like, optional) – Data features to use for staged predictions.

  • y (array-like, optional) – True labels for X.

  • metrics (List[BaseMetric], optional) – Metrics to compute for each boosting stage.

Returns:

List of metric values for each boosting round.

Return type:

list

Raises:
  • RuntimeError – If the model is not fitted yet.

  • ValueError – If X or y are not provided when a metric is requested.

  • AttributeError – If no loss history is available.

get_metrics(X, y, metrics=None)

Compute metrics for the given data.

Parameters:
  • X (array-like) – Data features.

  • y (array-like) – Data targets.

  • metrics (list of BaseMetric) – List of metrics to compute.

Returns:

Mapping from metric names to metric values.

Return type:

dict

get_params(**kwargs)

Get parameters for this model.

Returns:

Dictionary containing the model class and its parameters.

Return type:

dict

get_scorer_metric()

Returns the default metric used by the model’s scoring function.

Return type:

SklearnMetric

property has_loss_history: bool

Check if loss history is available from the last fit.

Returns:

True if loss history is available, False otherwise.

Return type:

bool

load(filepath, **kwargs)

Load a model from file.

Parameters:
  • filepath (str) – Path to file where the model is stored.

  • **kwargs – Additional arguments passed to joblib.load.

Return type:

None

predict(X, **predict_params)

Predict targets for samples in X.

Parameters:
  • X (array-like) – Input features.

  • **predict_params – Additional parameters for the underlying model’s predict method.

Returns:

Predicted values.

Return type:

pandas.Series

Raises:

RuntimeError – If the model is not fitted yet.

save(filepath, **kwargs)

Save the fitted model to a file.

Parameters:
  • filepath (str) – Path to the file where the model will be saved.

  • **kwargs – Additional arguments passed to joblib.dump.

Raises:

RuntimeError – If the model has not been fitted.

Return type:

None

score(X, y, **kwargs)

Return the score of the model on the given test data and labels.

Parameters:
  • X (array-like) – Test data features.

  • y (array-like) – True labels for X.

  • **kwargs – Additional arguments for the underlying model’s score method.

Returns:

Score of the model.

Return type:

float

Raises:

RuntimeError – If the model is not fitted yet.

set_params(**params)

Set parameters for this model.

Parameters:

**params (dict) – Parameters to set for the model.

Returns:

The updated XGBoostModel instance (self).

Return type:

XGBoostModel

Module contents