Metrics for CL

BeGin provides the following metrics for final evaluation. In the expressions below, \(N\) is the number of tasks, \(\mathrm{M}_{i,j}\) denotes the performance on the j-th task after the i-th task is processed, \(\mathrm{M}^{Joint}\) is the basic performance matrix of the joint model, and \(r_i\) denotes the performance of a randomly initialized model on the i-th task.

Average Performance (AP)

Average performance on each task after learning all tasks.

\[\mathrm{AP}=\frac{\sum_{i=1}^{N}\mathrm{M}_{N,i}}{N}\]
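As a rough illustration of the AP formula, the sketch below averages the last row of a small performance matrix. The function name `average_performance`, the nested-list layout (zero-based, so `M[i][j]` corresponds to \(\mathrm{M}_{i+1,j+1}\)), and the numbers are assumptions made for this example; they are not part of BeGin's API.

```python
def average_performance(M):
    """AP: mean of the last row of the N x N performance matrix M."""
    n = len(M)
    return sum(M[n - 1][i] for i in range(n)) / n


# Hypothetical 3-task performance matrix (row: tasks learned so far, column: task evaluated).
M = [
    [0.90, 0.20, 0.10],
    [0.80, 0.85, 0.30],
    [0.70, 0.75, 0.80],
]

ap = average_performance(M)  # (0.70 + 0.75 + 0.80) / 3
```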

Average Forgetting (AF)

Average forgetting on each task after learning all tasks. We measure the forgetting on the i-th task by the difference between the performance on the i-th task after learning all tasks and the performance on the i-th task right after learning the i-th task.

\[\mathrm{AF}=\frac{\sum_{i=1}^{N-1}\left(\mathrm{M}_{N,i}-\mathrm{M}_{i,i}\right)}{N}\]
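The AF formula can be sketched the same way. The example below follows the expression above as written (sum over the first N−1 tasks, divided by N); the function name `average_forgetting`, the zero-based nested-list layout, and the numbers are assumptions for illustration only and are not taken from BeGin's code.

```python
def average_forgetting(M):
    """AF: sum over the first N-1 tasks of M[N-1][i] - M[i][i], divided by N."""
    n = len(M)
    total = sum(M[n - 1][i] - M[i][i] for i in range(n - 1))
    return total / n


# Hypothetical 3-task performance matrix (row: tasks learned so far, column: task evaluated).
M = [
    [0.90, 0.20, 0.10],
    [0.80, 0.85, 0.30],
    [0.70, 0.75, 0.80],
]

af = average_forgetting(M)  # ((0.70 - 0.90) + (0.75 - 0.85)) / 3
```

With this sign convention, a negative value means that performance on earlier tasks dropped after the later tasks were learned.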

Intransigence (INT)

Average intransigence on each task. We measure the intransigence on the i-th task by the difference between the performance of the joint model and that of the target model on the i-th task, each measured after learning the i-th task.

\[\mathrm{INT}=\frac{\sum_{i=1}^{N}\left(\mathrm{M}^{Joint}_{i,i}-\mathrm{M}_{i,i}\right)}{N}\]
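A minimal sketch of the INT formula, assuming both performance matrices are available as zero-based nested lists of the same size. The function name `intransigence` and all numbers are hypothetical and are not part of BeGin's API; only the diagonals of the two matrices are used.

```python
def intransigence(M_joint, M):
    """INT: mean over tasks of M_joint[i][i] - M[i][i]."""
    n = len(M)
    return sum(M_joint[i][i] - M[i][i] for i in range(n)) / n


# Hypothetical matrices for 3 tasks; off-diagonal entries do not affect INT.
M_joint = [
    [0.92, 0.00, 0.00],
    [0.00, 0.88, 0.00],
    [0.00, 0.00, 0.86],
]
M = [
    [0.90, 0.20, 0.10],
    [0.80, 0.85, 0.30],
    [0.70, 0.75, 0.80],
]

int_score = intransigence(M_joint, M)  # (0.02 + 0.03 + 0.06) / 3
```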

Forward Transfer (FWT)

Average forward transfer on each task. We measure the forward transfer on the i-th task by the difference between the performance on the i-th task after learning the (i−1)-th task and the performance \(r_i\) of a randomly initialized model on the i-th task, that is, without any learning.

\[\mathrm{FWT}=\frac{\sum_{i=2}^{N}\left(\mathrm{M}_{i-1,i}-r_{i}\right)}{N}\]
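The FWT formula can be sketched as follows, again following the expression above as written (sum over tasks 2 to N, divided by N). The function name `forward_transfer`, the zero-based layout (so `M[i - 1][i]` corresponds to \(\mathrm{M}_{i,i+1}\)), and the numbers are assumptions for illustration and do not come from BeGin's code.

```python
def forward_transfer(M, r):
    """FWT: sum over tasks 2..N of M[i-1][i] - r[i] (zero-based i = 1..N-1), divided by N."""
    n = len(M)
    total = sum(M[i - 1][i] - r[i] for i in range(1, n))
    return total / n


# Hypothetical 3-task performance matrix and random-initialization baselines.
M = [
    [0.90, 0.20, 0.10],
    [0.80, 0.85, 0.30],
    [0.70, 0.75, 0.80],
]
r = [0.10, 0.15, 0.12]

fwt = forward_transfer(M, r)  # ((0.20 - 0.15) + (0.30 - 0.12)) / 3
```

A positive value indicates that learning earlier tasks helped on the next task before it was trained on, relative to the random-initialization baseline.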