GEM

Gradient Episodic Memory (GEM) is a replay-based continual learning method which stores data from previous tasks and prevents the increase of losses on them while learning a new task. For the details, see the original paper.

Node-level Problems

class NCClassILGEMMinibatchTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]

afterInference(results, model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right after the inference step (for training). We recommend performing backpropagation in this event function.

Using the computed gradients from the samples, GEM controls the gradients for the current task with quadratic programming.

Parameters:

results (dict) – the returned dictionary from the event function inference.
model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

Returns:

A dictionary containing the information from the results.

beforeInference(model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right before inference (for training).

GEM computes the gradients for the previous tasks using the sampled data stored in the memory.

Parameters:

model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

initTrainingStates(scenario, model, optimizer)[source]

The event function to initialize the dictionary for storing training states (i.e., intermedeiate results).

Parameters:

scenario (begin.scenarios.common.BaseScenarioLoader) – the given ScenarioLoader to the trainer
model (torch.nn.Module) – the given model to the trainer
optmizer (torch.optim.Optimizer) – the optimizer generated from the given optimizer_fn

Returns:

Initialized training state (dict).

processAfterTraining(task_id, curr_dataset, curr_model, curr_optimizer, curr_training_states)[source]

The event function to execute some processes after training the current task.

GEM samples the instances in the training dataset for computing gradients in beforeInference() (or processTrainIteration()) for the future tasks.

Parameters:

task_id (int) – the index of the current task.
curr_dataset (object) – The dataset for the current task.
curr_model (torch.nn.Module) – the current trained model.
curr_optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_training_states (dict) – the dictionary containing the current training states.

class NCClassILGEMTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]

afterInference(results, model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right after the inference step (for training). We recommend performing backpropagation in this event function.

Using the computed gradients from the samples, GEM controls the gradients for the current task with quadratic programming.

Parameters:

results (dict) – the returned dictionary from the event function inference.
model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

Returns:

A dictionary containing the information from the results.

beforeInference(model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right before inference (for training).

GEM computes the gradients for the previous tasks using the sampled data stored in the memory.

Parameters:

model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

initTrainingStates(scenario, model, optimizer)[source]

The event function to initialize the dictionary for storing training states (i.e., intermedeiate results).

Parameters:

scenario (begin.scenarios.common.BaseScenarioLoader) – the given ScenarioLoader to the trainer
model (torch.nn.Module) – the given model to the trainer
optmizer (torch.optim.Optimizer) – the optimizer generated from the given optimizer_fn

Returns:

Initialized training state (dict).

processAfterTraining(task_id, curr_dataset, curr_model, curr_optimizer, curr_training_states)[source]

The event function to execute some processes after training the current task.

GEM samples the instances in the training dataset for computing gradients in beforeInference() (or processTrainIteration()) for the future tasks.

Parameters:

task_id (int) – the index of the current task.
curr_dataset (object) – The dataset for the current task.
curr_model (torch.nn.Module) – the current trained model.
curr_optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_training_states (dict) – the dictionary containing the current training states.

class NCDomainILGEMTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]: This trainer has the same behavior as NCClassILGEMTrainer.

class NCTaskILGEMTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]

afterInference(results, model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right after the inference step (for training). We recommend performing backpropagation in this event function.

Using the computed gradients from the samples, GEM controls the gradients for the current task with quadratic programming.

Parameters:

results (dict) – the returned dictionary from the event function inference.
model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

Returns:

A dictionary containing the information from the results.

beforeInference(model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right before inference (for training).

GEM computes the gradients for the previous tasks using the sampled data stored in the memory.

Parameters:

model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

inference(model, _curr_batch, training_states)[source]

The event function to execute inference step.

For task-IL, we need to additionally consider task information for the inference step.

Parameters:

model (torch.nn.Module) – the current trained model.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

Returns:

A dictionary containing the inference results, such as prediction result and loss.

initTrainingStates(scenario, model, optimizer)[source]

The event function to initialize the dictionary for storing training states (i.e., intermedeiate results).

Parameters:

scenario (begin.scenarios.common.BaseScenarioLoader) – the given ScenarioLoader to the trainer
model (torch.nn.Module) – the given model to the trainer
optmizer (torch.optim.Optimizer) – the optimizer generated from the given optimizer_fn

Returns:

Initialized training state (dict).

processAfterTraining(task_id, curr_dataset, curr_model, curr_optimizer, curr_training_states)[source]

The event function to execute some processes after training the current task.

GEM samples the instances in the training dataset for computing gradients in beforeInference() (or processTrainIteration()) for the future tasks.

Parameters:

task_id (int) – the index of the current task.
curr_dataset (object) – The dataset for the current task.
curr_model (torch.nn.Module) – the current trained model.
curr_optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_training_states (dict) – the dictionary containing the current training states.

class NCTimeILGEMTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]: This trainer has the same behavior as NCClassILGEMTrainer.

Link-level Problems

class LCClassILGEMTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]

afterInference(results, model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right after the inference step (for training). We recommend performing backpropagation in this event function.

Using the computed gradients from the samples, GEM controls the gradients for the current task with quadratic programming.

Parameters:

results (dict) – the returned dictionary from the event function inference.
model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

Returns:

A dictionary containing the information from the results.

beforeInference(model, optimizer, _curr_batch, training_states)[source]

” The event function to execute some processes right before inference (for training).

GEM computes the gradients for the previous tasks using the sampled data stored in the memory.

Parameters:

model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

initTrainingStates(scenario, model, optimizer)[source]

The event function to initialize the dictionary for storing training states (i.e., intermedeiate results).

Parameters:

scenario (begin.scenarios.common.BaseScenarioLoader) – the given ScenarioLoader to the trainer
model (torch.nn.Module) – the given model to the trainer
optmizer (torch.optim.Optimizer) – the optimizer generated from the given optimizer_fn

Returns:

Initialized training state (dict).

processAfterTraining(task_id, curr_dataset, curr_model, curr_optimizer, curr_training_states)[source]

The event function to execute some processes after training the current task.

GEM samples the instances in the training dataset for computing gradients in beforeInference() (or processTrainIteration()) for the future tasks.

Parameters:

task_id (int) – the index of the current task.
curr_dataset (object) – The dataset for the current task.
curr_model (torch.nn.Module) – the current trained model.
curr_optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_training_states (dict) – the dictionary containing the current training states.

class LCTaskILGEMTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]

afterInference(results, model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right after the inference step (for training). We recommend performing backpropagation in this event function.

Using the computed gradients from the samples, GEM controls the gradients for the current task with quadratic programming.

Parameters:

results (dict) – the returned dictionary from the event function inference.
model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

Returns:

A dictionary containing the information from the results.

beforeInference(model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right before inference (for training).

GEM computes the gradients for the previous tasks using the sampled data stored in the memory.

Parameters:

model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

inference(model, _curr_batch, training_states)[source]

The event function to execute inference step.

For task-IL, we need to additionally consider task information for the inference step.

Parameters:

model (torch.nn.Module) – the current trained model.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

Returns:

A dictionary containing the inference results, such as prediction result and loss.

initTrainingStates(scenario, model, optimizer)[source]

The event function to initialize the dictionary for storing training states (i.e., intermedeiate results).

Parameters:

scenario (begin.scenarios.common.BaseScenarioLoader) – the given ScenarioLoader to the trainer
model (torch.nn.Module) – the given model to the trainer
optmizer (torch.optim.Optimizer) – the optimizer generated from the given optimizer_fn

Returns:

Initialized training state (dict).

prepareLoader(_curr_dataset, curr_training_states)[source]

The event function to generate dataloaders from the given dataset for the current task.

For task-IL, we need to additionally consider task information for the inference step.

Parameters:

curr_dataset (object) – The dataset for the current task. Its type is dgl.graph for node-level and link-level problem, and dgl.data.DGLDataset for graph-level problem.
curr_training_states (dict) – the dictionary containing the current training states.

Returns:

A tuple containing three dataloaders. The trainer considers the first dataloader, second dataloader, and third dataloader as dataloaders for training, validation, and test, respectively.

processAfterTraining(task_id, curr_dataset, curr_model, curr_optimizer, curr_training_states)[source]

The event function to execute some processes after training the current task.

GEM samples the instances in the training dataset for computing gradients in beforeInference() (or processTrainIteration()) for the future tasks.

Parameters:

task_id (int) – the index of the current task.
curr_dataset (object) – The dataset for the current task.
curr_model (torch.nn.Module) – the current trained model.
curr_optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_training_states (dict) – the dictionary containing the current training states.

class LCTimeILGEMTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]

afterInference(results, model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right after the inference step (for training). We recommend performing backpropagation in this event function.

Using the computed gradients from the samples, GEM controls the gradients for the current task with quadratic programming.

Parameters:

results (dict) – the returned dictionary from the event function inference.
model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

Returns:

A dictionary containing the information from the results.

beforeInference(model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right before inference (for training).

GEM computes the gradients for the previous tasks using the sampled data stored in the memory.

Parameters:

model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

initTrainingStates(scenario, model, optimizer)[source]

The event function to initialize the dictionary for storing training states (i.e., intermedeiate results).

Parameters:

scenario (begin.scenarios.common.BaseScenarioLoader) – the given ScenarioLoader to the trainer
model (torch.nn.Module) – the given model to the trainer
optmizer (torch.optim.Optimizer) – the optimizer generated from the given optimizer_fn

Returns:

Initialized training state (dict).

processAfterTraining(task_id, curr_dataset, curr_model, curr_optimizer, curr_training_states)[source]

The event function to execute some processes after training the current task.

GEM samples the instances in the training dataset for computing gradients in beforeInference() (or processTrainIteration()) for the future tasks.

Parameters:

task_id (int) – the index of the current task.
curr_dataset (object) – The dataset for the current task.
curr_model (torch.nn.Module) – the current trained model.
curr_optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_training_states (dict) – the dictionary containing the current training states.

processBeforeTraining(task_id, curr_dataset, curr_model, curr_optimizer, curr_training_states)[source]

The event function to execute some processes before training.

We need to extend the base function since the output format is slightly different from the base trainer.

Parameters:

task_id (int) – the index of the current task
curr_dataset (object) – The dataset for the current task.
curr_model (torch.nn.Module) – the current trained model.
curr_optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_training_states (dict) – the dictionary containing the current training states.

processEvalIteration(model, _curr_batch)[source]

The event function to handle every evaluation iteration.

We need to extend the function since the output format is slightly different from the base trainer.

Parameters:

model (torch.nn.Module) – the current trained model.
curr_batch (object) – the data (or minibatch) for the current iteration.

Returns:

A dictionary containing the outcomes (stats) during the evaluation iteration.

class LPDomainILGEMTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]: This trainer has the same behavior as LPTimeILGEMTrainer.

class LPTimeILGEMTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]

initTrainingStates(scenario, model, optimizer)[source]

The event function to initialize the dictionary for storing training states (i.e., intermedeiate results).

Parameters:

scenario (begin.scenarios.common.BaseScenarioLoader) – the given ScenarioLoader to the trainer
model (torch.nn.Module) – the given model to the trainer
optmizer (torch.optim.Optimizer) – the optimizer generated from the given optimizer_fn

Returns:

Initialized training state (dict).

processAfterTraining(task_id, curr_dataset, curr_model, curr_optimizer, curr_training_states)[source]

The event function to execute some processes after training the current task.

GEM samples the instances in the training dataset for computing gradients in beforeInference() (or processTrainIteration()) for the future tasks.

Parameters:

task_id (int) – the index of the current task.
curr_dataset (object) – The dataset for the current task.
curr_model (torch.nn.Module) – the current trained model.
curr_optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_training_states (dict) – the dictionary containing the current training states.

processBeforeTraining(task_id, curr_dataset, curr_model, curr_optimizer, curr_training_states)[source]

The event function to execute some processes before training.

GEM performs initialization (for every task) to manage the memory

Parameters:

task_id (int) – the index of the current task
curr_dataset (object) – The dataset for the current task.
curr_model (torch.nn.Module) – the current trained model.
curr_optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_training_states (dict) – the dictionary containing the current training states.

processTrainIteration(model, optimizer, _curr_batch, training_states)[source]

The event function to handle every training iteration.

GEM computes the gradients for the previous tasks using the sampled data stored in the memory. Using the computed gradients from the samples, GEM controls the gradients for the current task with quadratic programming.

Parameters:

model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

Returns:

A dictionary containing the outcomes (stats) during the training iteration.

Graph-level Problems

class GCClassILGEMTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]

afterInference(results, model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right after the inference step (for training). We recommend performing backpropagation in this event function.

Using the computed gradients from the samples, GEM controls the gradients for the current task with quadratic programming.

Parameters:

results (dict) – the returned dictionary from the event function inference.
model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

Returns:

A dictionary containing the information from the results.

beforeInference(model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right before inference (for training).

GEM computes the gradients for the previous tasks using the sampled data stored in the memory.

Parameters:

model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

initTrainingStates(scenario, model, optimizer)[source]

The event function to initialize the dictionary for storing training states (i.e., intermedeiate results).

Parameters:

scenario (begin.scenarios.common.BaseScenarioLoader) – the given ScenarioLoader to the trainer
model (torch.nn.Module) – the given model to the trainer
optmizer (torch.optim.Optimizer) – the optimizer generated from the given optimizer_fn

Returns:

Initialized training state (dict).

processAfterTraining(task_id, curr_dataset, curr_model, curr_optimizer, curr_training_states)[source]

The event function to execute some processes after training the current task.

GEM samples the instances in the training dataset for computing gradients in beforeInference() (or processTrainIteration()) for the future tasks.

Parameters:

task_id (int) – the index of the current task.
curr_dataset (object) – The dataset for the current task.
curr_model (torch.nn.Module) – the current trained model.
curr_optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_training_states (dict) – the dictionary containing the current training states.

processBeforeTraining(task_id, curr_dataset, curr_model, curr_optimizer, curr_training_states)[source]

The event function to execute some processes before training.

GEM performs initialization (for every task) to manage the memory

Parameters:

task_id (int) – the index of the current task
curr_dataset (object) – The dataset for the current task.
curr_model (torch.nn.Module) – the current trained model.
curr_optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_training_states (dict) – the dictionary containing the current training states.

class GCDomainILGEMTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]: This trainer has the same behavior as GCClassILGEMTrainer.

class GCTaskILGEMTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]

afterInference(results, model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right after the inference step (for training). We recommend performing backpropagation in this event function.

Using the computed gradients from the samples, GEM controls the gradients for the current task with quadratic programming.

Parameters:

results (dict) – the returned dictionary from the event function inference.
model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

Returns:

A dictionary containing the information from the results.

beforeInference(model, optimizer, _curr_batch, training_states)[source]

The event function to execute some processes right before inference (for training).

GEM computes the gradients for the previous tasks using the sampled data stored in the memory.

Parameters:

model (torch.nn.Module) – the current trained model.
optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

inference(model, _curr_batch, training_states)[source]

The event function to execute inference step.

For task-IL, we need to additionally consider task information for the inference step.

Parameters:

model (torch.nn.Module) – the current trained model.
curr_batch (object) – the data (or minibatch) for the current iteration.
curr_training_states (dict) – the dictionary containing the current training states.

Returns:

A dictionary containing the inference results, such as prediction result and loss.

initTrainingStates(scenario, model, optimizer)[source]

The event function to initialize the dictionary for storing training states (i.e., intermedeiate results).

Parameters:

scenario (begin.scenarios.common.BaseScenarioLoader) – the given ScenarioLoader to the trainer
model (torch.nn.Module) – the given model to the trainer
optmizer (torch.optim.Optimizer) – the optimizer generated from the given optimizer_fn

Returns:

Initialized training state (dict).

processAfterTraining(task_id, curr_dataset, curr_model, curr_optimizer, curr_training_states)[source]

The event function to execute some processes after training the current task.

GEM samples the instances in the training dataset for computing gradients in beforeInference() (or processTrainIteration()) for the future tasks.

Parameters:

task_id (int) – the index of the current task.
curr_dataset (object) – The dataset for the current task.
curr_model (torch.nn.Module) – the current trained model.
curr_optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_training_states (dict) – the dictionary containing the current training states.

processBeforeTraining(task_id, curr_dataset, curr_model, curr_optimizer, curr_training_states)[source]

The event function to execute some processes before training.

GEM performs initialization (for every task) to manage the memory

Parameters:

task_id (int) – the index of the current task
curr_dataset (object) – The dataset for the current task.
curr_model (torch.nn.Module) – the current trained model.
curr_optimizer (torch.optim.Optimizer) – the current optimizer function.
curr_training_states (dict) – the dictionary containing the current training states.

class GCTimeILGEMTrainer(model, scenario, optimizer_fn, loss_fn, device, **kwargs)[source]: This trainer has the same behavior as GCClassILGEMTrainer.