Smoothing Accelerated Proximal Gradient Method with Fast Convergence Rate for Nonsmooth Multi-objective Optimization
Abstract.
This paper proposes a Smoothing Accelerated Proximal Gradient Method with Extrapolation Term (SAPGM) for nonsmooth multiobjective optimization. By combining smoothing methods with the accelerated algorithm for multiobjective optimization of Tanabe et al., our method achieves a fast convergence rate. Specifically, we establish that the convergence rate of the proposed method can be enhanced beyond that of the multiobjective proximal gradient method by incorporating an extrapolation term with parameter \alpha > 3. Moreover, we prove that the sequence of iterates converges to a Pareto optimal solution of the primal problem. Furthermore, we present an effective strategy for solving the subproblem through its dual representation and validate the efficacy of the proposed method through a series of numerical experiments.
Key words and phrases:
Nonsmooth multiobjective optimization, Smoothing method, Accelerated algorithm with extrapolation, Convergence rate, Sequential convergence.
1991 Mathematics Subject Classification:
Primary: 49J52, 65K05; Secondary: 90C25, 90C29.
Chengzhi Huang
1Chongqing Normal University, China
1. Introduction
Multiobjective optimization involves the simultaneous minimization (or maximization) of multiple objective functions while considering relevant constraints. The concept of Pareto optimality becomes crucial, as finding a single point that minimizes all objective functions concurrently is challenging. A point is deemed Pareto optimal if there exists no other point with the same or smaller objective function values and at least one strictly smaller objective function value. Applications of multiobjective optimization are pervasive, spanning economics [9], engineering [21], mechanics [37], statistics [41], internet routing [12], and location problems [3].
This paper focuses predominantly on composite nonsmooth multiobjective optimization, expressed as:
(1) \min_{x \in \mathbb{R}^n} F(x) := (F_1(x), \ldots, F_m(x))^\top,
with each component F_i, i = 1, \ldots, m, taking the form
(2) F_i(x) = f_i(x) + g_i(x),
where f_i : \mathbb{R}^n \to \mathbb{R} is a convex but possibly nonsmooth function, and g_i : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\} is a closed, proper, and convex function, which may also be nonsmooth.
The composite optimization problem is a significant class of optimization problems, not only because it encompasses various practical challenges—such as minimax problems [49] and penalty methods for constrained optimization [48]—but also due to its wide range of applications. For instance, as discussed in [44], the separable structure in (1) can be used to model robust multi-objective optimization problems, which involve uncertain parameters and optimize for the worst-case scenario. Additionally, this structure is applicable in machine learning[25], particularly for solving multi-objective clustering problems.
Naturally, we are interested in methods for solving multiobjective optimization problems. Common approaches include scalarization methods, evolutionary methods, and gradient-based methods.
Scalarization is a fundamental approach to solving multiobjective optimization problems, transforming them into single-objective ones. Various procedures, such as optimizing one objective while treating the others as constraints [36], or aggregating all objectives [39], are commonly applied. Evolutionary algorithms [46] provide another avenue, but proving their convergence rate poses challenges. Consequently, traditional methods that solve the problem directly are also employed.
In response to these limitations, descent methods for multiobjective optimization problems have gained significant attention. These algorithms, which reduce all objective functions at each iteration, offer advantages such as not requiring prior parameter selection and providing convergence guarantees under reasonable assumptions. Noteworthy methods include the steepest descent [15], projected gradient [17], proximal point [4], Newton [16], trust region [5], and conjugate gradient [27] methods. Among these, first-order methods, which use only the first-order derivatives of the objective functions, are distinguished, such as the steepest descent, projected gradient, and proximal gradient methods. The proximal gradient method converges to Pareto solutions with a rate of O(1/k).
To enhance the convergence efficiency of the proximal gradient method, numerous scholars have endeavored to introduce acceleration techniques into single-objective first-order methodologies; see [33], [6], [7], [1] for detailed accounts.
The success of acceleration algorithms in the single-objective setting prompted a significant surge of interest in their efficacy for multiobjective optimization problems. A noteworthy recent development by Tanabe et al. [42] extends the highly regarded Fast Iterative Shrinkage-Thresholding Algorithm (FISTA) to the multiobjective context. The ensuing convergence rate of O(1/k^2), characterized by a merit function [43], represents a substantial improvement over the proximal gradient method for multiobjective optimization problems (MOPs) [44]. Moreover, Nishimura et al. [35] have established a monotone version of the multiobjective FISTA, adding to the methodological advancements in this domain. Furthermore, Tanabe et al. [45] have expanded the applicability of the multiobjective FISTA by introducing hyperparameters, offering a generalization applicable even in single-objective scenarios; importantly, this extended framework preserves the O(1/k^2) convergence rate observed in the multiobjective FISTA, and the iterate sequence is proved to be convergent. Inspired by the impact of the extrapolation parameter in the single-objective case [1], we introduce the extrapolation parameter \frac{k-1}{k+\alpha-1} with \alpha > 3 into the multiobjective proximal gradient algorithm.
After resolving the question of algorithmic acceleration, another question follows: how can nonsmooth multiobjective optimization problems be handled efficiently?
For nonsmooth multiobjective optimization problems, current research mainly includes Mäkelä et al.'s proximal bundle method [31, 32, 20, 30] and subgradient methods. Gebken et al. [18] proposed a subgradient descent algorithm for nonsmooth multiobjective optimization problems by combining the descent direction from [29] with the approximation based on Goldstein's \varepsilon-subdifferential from [19]. Besides, Sonntag et al. [26] proposed a new subgradient method for nonsmooth vector optimization problems, which includes regularization and interior point variants of Newton's method. However, all of these methods require complex calculations and numerous calls to subgradients, resulting in a significant increase in computation time. Fortunately, Chen [8] proposed a smoothing construction, which uses a sequence of smooth functions to approximate the objective functions of the primal problem. This construction avoids computing subgradients and directly uses the gradient of the smoothing function. Inspired by this idea, we construct a fast algorithm for nonsmooth multiobjective optimization under the smoothing framework, combined with the previously mentioned accelerated proximal gradient method with extrapolation terms.
Moreover, with practical computational efficiency in mind, we derive a convex and differentiable dual of the subproblem, simplifying its solution, particularly when the number of objective functions is fewer than the decision variable dimension. The entire algorithm is implemented using this dual problem, and its effectiveness is confirmed through numerical experiments.
The structure of this paper unfolds as follows: Section 2 introduces notations and concepts, Section 3 presents the smoothing accelerated proximal gradient method with extrapolation for nonsmooth multiobjective optimization, and Section 4 analyzes its convergence rate. Section 5 outlines an efficient method to solve the subproblem through its dual form, and Section 6 reports numerical results for test problems.
2. Preliminary Results
In this paper, for any natural number n, the symbol \mathbb{R}^n denotes the n-dimensional Euclidean space. The notation \mathbb{R}^n_+ is employed to signify the non-negative orthant of \mathbb{R}^n, i.e., \mathbb{R}^n_+ = \{x \in \mathbb{R}^n : x \ge 0\}. Additionally, \Delta_m represents the standard simplex in \mathbb{R}^m_+ and is defined as \Delta_m = \{\lambda \in \mathbb{R}^m_+ : \sum_{i=1}^m \lambda_i = 1\}.
Subsequently, the partial orders induced by \mathbb{R}^m_+ are considered, where for any u, v \in \mathbb{R}^m, u \le v (alternatively, u \ge v) holds if v - u \in \mathbb{R}^m_+ (respectively, u - v \in \mathbb{R}^m_+), and u < v (alternatively, u > v) if v - u \in \operatorname{int}\mathbb{R}^m_+ (respectively, u - v \in \operatorname{int}\mathbb{R}^m_+). Moreover, \langle \cdot, \cdot \rangle denotes the Euclidean inner product in \mathbb{R}^n, specifically defined as \langle x, y \rangle = \sum_{i=1}^n x_i y_i. The Euclidean norm is introduced as \|x\| = \sqrt{\langle x, x \rangle}. Furthermore, the \ell_1-norm and the \ell_\infty-norm are defined by \|x\|_1 = \sum_{i=1}^n |x_i| and \|x\|_\infty = \max_{1 \le i \le n} |x_i|, respectively.
Because of the construction of the proximal gradient algorithm, we should introduce some basic definitions for the following discussion. For a closed, proper, and convex function g and a parameter \mu > 0, the Moreau envelope of g is defined by
e_{\mu}g(x) = \min_{y} \left\{ g(y) + \frac{1}{2\mu}\|y - x\|^2 \right\}.
The unique solution of the above problem is called the proximal operator of g, written as
\operatorname{prox}_{\mu g}(x) = \operatorname*{arg\,min}_{y} \left\{ g(y) + \frac{1}{2\mu}\|y - x\|^2 \right\}.
Next, we state a relation between the Moreau envelope and the proximal operator in the following lemma.
Lemma 2.1 ([38]).
If g is a proper, closed, and convex function, the Moreau envelope e_{\mu}g is Lipschitz continuously differentiable, and its gradient takes the following form:
\nabla e_{\mu}g(x) = \frac{1}{\mu}\left(x - \operatorname{prox}_{\mu g}(x)\right).
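As a quick numerical illustration (our addition, not part of the original analysis), the standard gradient formula \nabla e_{\mu}g(x) = (x - \operatorname{prox}_{\mu g}(x))/\mu (cf. [38]) can be checked for g = \|\cdot\|_1, whose proximal operator is the componentwise soft-thresholding map:

```python
import numpy as np

def prox_l1(x, mu):
    # Proximal operator of mu * ||.||_1: componentwise soft-thresholding.
    return np.sign(x) * np.maximum(np.abs(x) - mu, 0.0)

def moreau_env_l1(x, mu):
    # Moreau envelope e_mu g(x) = g(p) + ||p - x||^2 / (2 mu), p = prox_{mu g}(x).
    p = prox_l1(x, mu)
    return np.sum(np.abs(p)) + np.sum((p - x) ** 2) / (2 * mu)

def grad_moreau_env_l1(x, mu):
    # Gradient formula from the lemma: (x - prox_{mu g}(x)) / mu.
    return (x - prox_l1(x, mu)) / mu

x = np.array([1.5, -0.3, 0.0, 2.0])
mu = 0.5
# Central finite differences agree with the closed-form gradient.
eps = 1e-6
fd = np.array([
    (moreau_env_l1(x + eps * e, mu) - moreau_env_l1(x - eps * e, mu)) / (2 * eps)
    for e in np.eye(4)
])
print(np.allclose(fd, grad_moreau_env_l1(x, mu), atol=1e-5))  # True
```

The envelope is smooth even though g is not, which is exactly what makes it useful for gradient-based schemes.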
As explicated in the Introduction section, the principal challenge in addressing the optimization problem denoted as (1) through the Proximal Gradient (PG) and Accelerated Proximal Gradient (APG) methods comes from the nonsmooth nature of the objective function . Specifically, when is nonsmooth or its gradient lacks global Lipschitz continuity, a straightforward approach involves resorting to the smoothing method, a pivotal aspect in our analytical framework. In the context of this study, we introduce an algorithm utilizing the smoothing function delineated in [8]. This smoothing function serves the purpose of approximating the nonsmooth convex function by a set of smooth convex functions, thereby facilitating the application of gradient-based optimization techniques.
Definition 2.2 ([8]).
For the convex function f in (2), we call \tilde{f} : \mathbb{R}^n \times (0, +\infty) \to \mathbb{R} a smoothing function of f if \tilde{f} satisfies the following conditions:
(i) for any fixed \mu > 0, \tilde{f}(\cdot, \mu) is continuously differentiable on \mathbb{R}^n;
(ii) \lim_{z \to x,\, \mu \downarrow 0} \tilde{f}(z, \mu) = f(x) for any x \in \mathbb{R}^n;
(iii) (gradient consistency) \left\{ \lim_{z \to x,\, \mu \downarrow 0} \nabla_z \tilde{f}(z, \mu) \right\} \subseteq \partial f(x) for any x \in \mathbb{R}^n;
(iv) for any fixed \mu > 0, \tilde{f}(\cdot, \mu) is convex on \mathbb{R}^n;
(v) there exists a \kappa > 0 such that |\tilde{f}(x, \mu_1) - \tilde{f}(x, \mu_2)| \le \kappa |\mu_1 - \mu_2| for any x \in \mathbb{R}^n and \mu_1, \mu_2 > 0;
(vi) there exists an L > 0 such that \nabla_x \tilde{f}(\cdot, \mu) is Lipschitz continuous on \mathbb{R}^n with factor L\mu^{-1} for any fixed \mu > 0.
Combining properties (ii) and (v) in Definition 2.2, we have |\tilde{f}(x, \mu) - f(x)| \le \kappa \mu for any x \in \mathbb{R}^n and \mu > 0.
The exploration of smooth approximations for diverse specialized nonsmooth functions has a venerable lineage, yielding a wealth of theoretical insights [8], [13], [34], [40], [22]. The foundational conditions (i)–(iii) articulated herein are integral elements in the characterization of a smoothing function, as delineated in [8]. These conditions are imperative for ensuring the efficacy of smoothing methods when applied to the resolution of corresponding nonsmooth problems. Condition (iv) stipulates that the smoothing function preserves the convexity of f for any fixed \mu > 0. Conditions (v) and (vi) serve to guarantee the global Lipschitz continuity of the smoothing function in \mu for any fixed x, and of its gradient in x for any fixed \mu, respectively. These conditions collectively establish a foundation for the utility and effectiveness of the smoothing function in the context of nonsmooth optimization problems.
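As a concrete one-dimensional illustration (our choice, not one of the paper's test functions), f(t) = |t| admits the classical smoothing \tilde{f}(t, \mu) = \sqrt{t^2 + \mu^2}, which is smooth and convex in t and satisfies 0 \le \tilde{f}(t, \mu) - f(t) \le \mu, so condition (v) holds with \kappa = 1:

```python
import numpy as np

def f(t):
    return np.abs(t)

def f_smooth(t, mu):
    # Classical smoothing of |t|: smooth in t for mu > 0, convex, and
    # f(t) <= f_smooth(t, mu) <= f(t) + mu, so kappa = 1 in condition (v).
    return np.sqrt(t**2 + mu**2)

t = np.linspace(-2, 2, 401)
for mu in (1.0, 0.1, 0.001):
    gap = np.max(f_smooth(t, mu) - f(t))
    # The approximation error shrinks linearly with mu (worst case at t = 0).
    print(mu, gap <= mu + 1e-12)
```

Driving \mu to zero recovers the original nonsmooth function, which is the mechanism the algorithm below exploits.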
We now revisit the optimality criteria for the multiobjective optimization problem (1). An element x^* \in \mathbb{R}^n is deemed weakly Pareto optimal if there does not exist x \in \mathbb{R}^n such that F(x) < F(x^*), where F = (F_1, \ldots, F_m)^\top represents the vector-valued objective function. The set of weakly Pareto optimal solutions is denoted by X^*. The merit function u_0, as introduced in [43], is expressed in the following manner:
(3) u_0(x) = \sup_{z \in \mathbb{R}^n} \min_{i = 1, \ldots, m} \left[ F_i(x) - F_i(z) \right].
It is known [43] that u_0 is a merit function in the Pareto sense: u_0(x) \ge 0 for all x \in \mathbb{R}^n, and u_0(x) = 0 if and only if x is weakly Pareto optimal.
3. The Smoothing Accelerated Proximal Gradient Method with Extrapolation term for Non-smooth Multi-objective Optimization
This section introduces an accelerated variant of the proximal gradient method tailored for multiobjective optimization. Drawing inspiration from the results reported in [1], we incorporate extrapolation techniques with parameters \frac{k-1}{k+\alpha-1}, where \alpha > 3. Choosing the smoothing function as in Definition 2.2, we formulate an accelerated proximal gradient algorithm to solve the multiobjective optimization problem (1). The algorithm achieves a faster convergence rate while also guaranteeing sequential convergence.
Subsequently, we present the methodology employed to address the optimization problem denoted as (1). Similar to the exposition in [42], a subproblem is delineated and resolved in each iteration. Using the descent lemma, the proposed approach tackles the ensuing subproblem for prescribed values of , , and :
(4) |
where
(5) |
Since the functions involved are convex and the quadratic term is strongly convex, the objective of (4) is strongly convex. Thus, the subproblem (4) has a unique optimal solution, which attains the optimal function value, i.e.,
(6) |
Furthermore, the optimality condition associated with the optimization problem (4) implies that there exists a Lagrange multiplier \lambda in the standard simplex \Delta_m such that
(7) |
(8) |
where \Delta_m denotes the standard simplex and
(9) |
Before we present the algorithm framework, we first give the following assumption.
Assumption 3.1.
Suppose X^* is the set of weakly Pareto optimal points. Then, for any point in the relevant level set, there exists x^* \in X^* such that F(x^*) \le F(x) and
For ease of reference, and corresponding to its structure, we call the proposed algorithm the smoothing accelerated proximal gradient method with extrapolation term for nonsmooth multiobjective optimization (SAPGM) in this paper. The algorithm takes the following form.
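To make the scheme concrete, the following Python sketch illustrates its main ingredients on a single-objective instance (m = 1), where the subproblem reduces to a plain gradient step on the smoothed objective: an extrapolation step with coefficient (k-1)/(k+\alpha-1), a gradient step, and a smoothing parameter \mu_k driven to zero. The objective, step size, and \mu-update below are illustrative assumptions, not the paper's exact choices.

```python
import numpy as np

# Schematic single-objective (m = 1) instance of the smoothing accelerated
# scheme: minimize f(x) = ||Ax - b||_1 through the smoothing
#   f~(x, mu) = sum_i sqrt((Ax - b)_i^2 + mu^2).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = A @ rng.standard_normal(5)

def grad_smooth(x, mu):
    # Gradient of the smoothed objective f~(., mu).
    r = A @ x - b
    return A.T @ (r / np.sqrt(r**2 + mu**2))

alpha = 4.0
L = np.linalg.norm(A, 2) ** 2      # grad of f~(., mu) is (L / mu)-Lipschitz
x_prev = x = np.zeros(5)
for k in range(1, 3001):
    mu = 1.0 / k                                        # smoothing parameter -> 0
    y = x + (k - 1) / (k + alpha - 1) * (x - x_prev)    # extrapolation step
    x_prev = x
    x = y - (mu / L) * grad_smooth(y, mu)               # step size = 1 / Lipschitz
print(round(float(np.linalg.norm(A @ x - b, 1)), 3))
```

The residual decreases toward zero as \mu_k vanishes; in the multiobjective case the gradient step is replaced by the solution of subproblem (4).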
4. The convergence rate analysis of SAPGM
4.1. Some Basic Estimation
This section shows that SAPGM attains a convergence rate faster than the O(1/k) rate of the proximal gradient method under Assumption 3.1. For the convenience of the complexity analysis, we use some functions defined in [42], given for the iterates by
(10) | ||||
Given a fixed weakly Pareto solution, define the global energy function that serves as a Lyapunov function for the analysis:
(11) | ||||
where .
Following the properties outlined in [47], we present the following properties regarding the sequence .
Proposition 4.1.
(i) the sequence is non-increasing for all
(ii) for every
Proof.
Before proving the proposition, we first establish the following inequality, which is crucial for the proof: for any admissible arguments, it holds that
(13) | ||||
From Step 4 and Step 7 of the algorithm, we can see that, for all k, the following inequality holds:
(14) | ||||
Set
We note that, for fixed data, the function defined above is strongly convex. Therefore, it has a unique global minimizer, which we record as follows:
Since is the minimum, we infer that
(15) |
Combining Step 3 and Step 7 with the definition of the proximal operator, we get
(16) |
Taking and in inequality (15), we have
After some rearrangement, we deduce that, for any ,
(17) | |||
Adding (17) and (14), and using convexity, we deduce that, for any
(18) | ||||
So we have proven that inequality (13) holds.
Recalling the corresponding quantity defined in [47], and in order to state the relation exactly, we use:
where the notation is as above. We can see that if the iterates are replaced accordingly and the reference point is taken to be a weak Pareto solution of the problem, then we get
Combining (13) with this definition, the quantity defined in our article has the following relation with its counterpart in [47]:
So we obtain the following two inequalities, which are basic for the subsequent discussion:
(19) |
and
(20) | ||||
The rest of the proof is similar to that of Proposition 3.1 in [47], so we omit the details. ∎
As a consequence of Proposition 4.1, we obtain some important properties of the generated sequence, as shown below; to this end, we first introduce an important lemma on sequence convergence.
Lemma 4.3.
is non-increasing.
Proof.
The update rule of the sequence directly shows that it is non-increasing. ∎
Theorem 4.4.
Suppose the sequences are generated by SAPGM. Then, for any k, the following statements hold:
(i) ;
(ii) exists;
(iii);
(iv)
Proof.
Before proceeding, we note that the proofs of (i), (ii), and (iii) are similar to the proof of Proposition 3.2 in [47].
(i) By summing inequality (12) from to , we obtain:
Now, letting k \to \infty in the above inequality and using Proposition 4.1 (ii), since the relevant terms are nonnegative for all k, we can infer that
(21) |
Since for all , it holds that
We further obtain:
This inequality follows from the fact that is increasing for all .
Therefore, inequality (21) implies:
(ii) Returning to inequality (19) and using the identity
with , and , we deduce:
By the definition of in SAPGM and multiplying the inequality by , we get:
Since , we can rearrange terms to obtain:
(22) |
Next, observe that
which leads to:
(23) | ||||
For simplicity, define:
Substituting (23) into (22), we obtain:
(24) |
Taking the positive part of the left-hand side and using inequality (21), we find:
Since , by Lemma 4.2, we infer that exists.
(iii) In view of , we observe that
combining which with (24), then we obtain
Summing up the above inequality for , we obtain
(25) | ||||
(iv) Combining (i) and (ii), we have
which implies
Observe that , then
(26) |
Combining this with (ii), we obtain
(27) |
by , which further implies
(28) |
Recalling the definition given in Step 2 of the algorithm, the non-increase of the smoothing parameter, and (10), the second equation in (28) implies
By the definition of ,we get that
From the definition of and the fact that is a weak Pareto point, we infer that
So we have
Because , we get that
So we know that
This result illustrates that for any in the SAPGM algorithm, it holds ∎
4.2. Sequential Convergence
In this subsection, we analyze the convergence of the iterates generated by SAPGM. In this context, we state the discrete version of Opial's lemma, laying the groundwork for a rigorous examination of the convergence properties of the sequence of iterates.
Lemma 4.5 ([47] Lemma 3.4).
Let S be a nonempty subset of \mathbb{R}^n and (x^k) be a sequence in \mathbb{R}^n.
Assume that
(i) \lim_{k \to \infty} \|x^k - x^*\| exists for every x^* \in S;
(ii) every sequential limit point of (x^k) as k \to \infty belongs to S.
Then, as k \to \infty, (x^k) converges to a point in S.
To prove the sequential convergence, we must recall the following inequality on nonnegative sequences, which will be used in the forthcoming sequential convergence result.
Lemma 4.6 ([47] Lemma 3.5).
Let (a_k) and (b_k) be two sequences of nonnegative numbers such that a_{k+1} \le a_k + b_k
for all k. If \sum_k b_k < +\infty, then \lim_{k \to \infty} a_k exists.
Theorem 4.7.
Let be the sequence generated by the algorithm. Then, as , the sequence converges to a weak Pareto solution of the original problem.
Proof.
Let be the sequence generated by SAPGM, and let be its cluster point. For any , if we can prove that , then is a weak Pareto optimal solution of the original problem.
Because , we only need to prove
Therefore, we can reformulate the problem using some smoothing function properties from [8].
Through the subproblem , we can get
Due to the convexity of , we have
Furthermore, the convexity of leads to
Therefore, we obtain
Letting \mu tend to 0 monotonically, we get
At the same time, letting , we have
Thus, is a weak Pareto optimal solution of the original problem.
Next, if we prove that, for every weak Pareto optimal solution x^*, \lim_{k \to \infty} \|x^k - x^*\| exists, then we can deduce the convergence of the sequence from Lemma 4.5.
Because of the following inequality,
we get
which implies
Then,
Letting k \to \infty, we obtain the existence of the limit, which completes the proof. ∎
Remark 4.8.
Now, in light of the sequential convergence, we set
as the stopping criterion of the algorithm. The above proof process makes the rationale for this choice natural.
5. Efficient computation of the subproblem via its dual
In the previous section, we proved the global convergence and complexity results of SAPGM. Subsequently, our focus shifts to empirically assessing the method’s practical efficacy. Specifically, we elucidate a methodology for computing the subproblem. To commence, let us introduce a formal definition.
(29) |
for all . Then, fixing some , we can rewrite the objective function as
Based on the discussion in [42], we obtain the dual problem as follows:
(30) | ||||
where
(31) | ||||
Given the identification of the global optimal solution for the dual problem (30), it becomes feasible to construct the optimal solution for the original subproblem as follows:
(32) |
where prox denotes the proximal operator.
Additionally, the dual objective is differentiable, which makes the Frank–Wolfe method readily applicable, as the following lemma shows.
Lemma 5.1 ([42],Theorem 6.1).
The function defined by (31) is continuously differentiable at every point, and
where prox is the proximal operator, and is the Jacobian matrix at given by
The proof is similar to that in [42]. This theorem establishes that the dual problem (30) constitutes an m-dimensional differentiable convex optimization problem. Consequently, computing the proximal operator of the relevant sum rapidly would enable the resolution of (30) through standard convex optimization techniques.
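Since (30) is a smooth convex problem over the standard simplex, the linear minimization oracle inside Frank–Wolfe reduces to selecting the vertex with the smallest partial derivative. Below is a minimal sketch with an illustrative quadratic objective standing in for the dual function (the true objective would use the gradient from Lemma 5.1):

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, iters=500):
    # Frank-Wolfe over the standard simplex: the linear subproblem
    # min_{s in simplex} <grad(x), s> is solved by the vertex e_i with
    # i = argmin_i grad(x)_i.
    x = x0.copy()
    for k in range(iters):
        g = grad(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0            # optimal vertex of the simplex
        gamma = 2.0 / (k + 2.0)          # classical step-size rule
        x = (1 - gamma) * x + gamma * s  # convex combination stays feasible
    return x

# Illustrative smooth convex stand-in for the dual (30):
# minimize ||lam - c||^2 over the simplex (here c already lies in the simplex).
c = np.array([0.2, 0.5, 0.3])
lam = frank_wolfe_simplex(lambda x: 2 * (x - c), np.array([1.0, 0.0, 0.0]))
print(np.linalg.norm(lam - c) < 0.2)  # True
```

The method is projection-free, which is precisely why differentiability of the dual objective suffices here.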
6. Numerical experiments
In this section, we present numerical results to show the good performance of the SAPGM algorithm for solving (1). The numerical experiments are performed in Python 3.10 on a 64-bit Lenovo PC with a 12th Gen Intel(R) Core(TM) i7-12700H CPU @ 2.70 GHz and 16 GB RAM. For comparison with SAPGM, we use DNNM [18], a descent method for locally Lipschitz multiobjective optimization problems, in controlled experiments on the same test problems. For simplicity, we use Iter to denote the number of iterations and Time to denote the running time of a program.
For convenience, we introduce some smoothing functions as follows. For the maximum function, we use its smoothing function from [14]:
For the two-term maximum function, it can be represented by the same construction, because it is a special case of the general maximum.
For the \ell_1-norm function, we define its smoothing function as follows:
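For reference, the exponential (log-sum-exp) construction used in the minimax-smoothing literature [14] smooths \max_i a_i as \mu \ln \sum_i e^{a_i/\mu}, with approximation error between 0 and \mu \ln m; a numerically stable sketch:

```python
import numpy as np

def smooth_max(a, mu):
    # Log-sum-exp smoothing of max_i a_i (shifted for numerical stability):
    # max_i a_i <= smooth_max(a, mu) <= max_i a_i + mu * ln(len(a)).
    m = np.max(a)
    return m + mu * np.log(np.sum(np.exp((a - m) / mu)))

a = np.array([1.0, 3.0, 2.5])
for mu in (1.0, 0.1, 0.01):
    err = smooth_max(a, mu) - np.max(a)
    print(0.0 <= err <= mu * np.log(len(a)))  # True for each mu
```

The shift by the maximum avoids overflow in the exponentials, which matters when \mu is small.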
To demonstrate the performance of SAPGM, we selected the DNNM algorithm as a comparison algorithm and chose three types of problems as benchmark tests: small-scale bi-objective optimization problems, large-scale bi-objective optimization problems with sparse structures, and tri-objective optimization problems. The objective functions in the test problems are selected from [23], [28], and [47]. We list them in the table below:
Problem | Functions
---|---
Large scale problem | ||
CR & MF2 | ||
CB3 & LQ | ||
CB3 & MF1 | ||
JOS1 & | ||
BK1 & | ||
SP1 & |
For the large-scale bi-objective optimization problems with sparse structures, we selected three sparsity levels: 10%, 20%, and 50%. For a given group of (m, n, Spar), the data in a large-scale problem is generated as follows:
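One plausible generator for such (m, n, Spar) instances is sketched below; this recipe is hypothetical and may differ from the one actually used (a random design matrix, a ground-truth point whose fraction of nonzero entries equals Spar, and consistent observations):

```python
import numpy as np

def make_sparse_instance(m, n, spar, seed=0):
    # Hypothetical generator: random design matrix A, ground-truth x* with
    # ceil(spar * n) nonzero entries, and observations b = A @ x*.
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))
    x_star = np.zeros(n)
    support = rng.choice(n, size=int(np.ceil(spar * n)), replace=False)
    x_star[support] = rng.standard_normal(support.size)
    return A, x_star, A @ x_star

A, x_star, b = make_sparse_instance(500, 100, 0.10)
print(A.shape, int(np.count_nonzero(x_star)))  # (500, 100) 10
```

Varying `spar` over 10%, 20%, and 50% reproduces the three sparsity levels tested.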
The parameter settings for the DNNM algorithm can be referenced below:
To demonstrate that using objective functions like JOS1 in three-objective test problems is reasonable, we compare SAPGM with the fast proximal gradient algorithm for multiobjective optimization [50], referred to as FPGA for convenience. This comparison shows that SAPGM can degenerate into FPGA, thereby confirming that the composition of the three-objective test problems is appropriate. The results are listed in the table below and in Figure 1. Although the involvement of smoothing causes SAPGM to be slower than FPGA on smooth problems, both obtain similar Pareto fronts, which indicates that SAPGM can degenerate into FPGA.
Problem | SAPGM: purity, spread metrics, hvs | FPGA: purity, spread metrics, hvs
---|---|---
JOS1 | 0.9155 | 0.0787 | 0.8684 | 0.1163 | 0.9155 | 0.0787 | 0.8684 | 0.1163 |
BK1 | 0.9670 | 0.4703 | 0.9989 | 0.0920 | 0.9520 | 0.1259 | 0.6884 | 0.0221 |
SP1 | 0.9437 | 0.1070 | 0.6819 | 0.0838 | 0.7370 | 0.3318 | 1.4252 | 0.0187 |
Figure 1. Pareto fronts obtained by SAPGM and FPGA: (a) JOS1, (b) BK1, (c) SP1.
We use the following metrics to evaluate the performance of the algorithms:
Number of Iterations: The total number of iterations required to meet the stopping criteria.
Time: The time taken to satisfy the stopping criteria.
Purity [2]: This metric represents the proportion of solutions obtained by a given solver that lie within the approximated Pareto frontier.
Hypervolume (hvs) [51]: This metric quantifies the volume of the objective space dominated by the obtained Pareto frontier.
Spread metrics (Γ and Δ) [10]: These metrics assess the distribution of solutions across the Pareto frontier.
Additionally, we constructed performance profiles [11] for each evaluation metric to facilitate a comprehensive comparison of the algorithms.
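The performance-profile curves of [11] are straightforward to compute: for each solver s, \rho_s(\tau) is the fraction of problems on which that solver's cost is within a factor \tau of the best cost. A sketch using a few of the run times reported in Table 3:

```python
import numpy as np

def performance_profile(T, taus):
    # T[p, s]: cost (e.g. time) of solver s on problem p.
    # rho_s(tau) = fraction of problems with T[p, s] <= tau * min_s' T[p, s'].
    ratios = T / T.min(axis=1, keepdims=True)
    return np.array([[np.mean(ratios[:, s] <= tau) for s in range(T.shape[1])]
                     for tau in taus])

# Illustrative data: rows = problems, columns = (SAPGM, DNNM) run times (s).
T = np.array([[97.6, 244.8],
              [128.4, 471.5],
              [96.1, 348.7],
              [27.4, 10941.3]])
rho = performance_profile(T, taus=[1.0, 2.0, 5.0])
print(rho[:, 0])  # SAPGM is fastest on every problem: [1. 1. 1.]
```

Plotting \rho_s against \tau yields the profiles shown in Figure 7.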
We now examine the performance of the algorithms. For each problem above, we run the algorithms from 200 different initial points. Figure 3 to Figure 5 show the Pareto fronts of the large-scale bi-objective optimization problems, Figure 6 shows the fronts of the three-objective optimization problems, and Figure 2 shows the fronts of the small-scale bi-objective optimization problems. In general, SAPGM characterizes the Pareto fronts well, while the DNNM algorithm cannot accurately reflect them on some problems. Table 3 shows the average computational time and iteration counts for each problem. From the table, it can be seen that acceleration is in general more efficient in terms of time. In fact, by checking the performance profiles given in Figure 7(a) and Figure 7(b), we observe that SAPGM performs better in terms of both iteration counts and time.
Besides raw performance, it is usually important to assess the quality of the obtained Pareto frontier. Thus, we again show performance profiles, this time for the spread metrics (Figure 7(c) and Figure 7(d)), hypervolume (Figure 7(e)), and purity (Figure 7(f)). SAPGM outperforms DNNM, obtaining better Pareto frontiers. We can thus conclude that, at least among the test problems considered, SAPGM is promising both in terms of performance and in producing uniform Pareto frontiers.
In cases where the SAPGM algorithm exhibits the same number of iterations across different problems in the table, we found that the iteration count is tied to the constraint placed on the smoothing parameter. As this constraint is relaxed, the number of iterations changes, but this does not significantly affect the characterization of the Pareto front. Three problems, CR&MF2, the three-objective JOS1 problem, and the large-scale problem with (m,n) = (500,100) and spar = 10%, are selected to explore the influence of this parameter on iteration counts and Pareto front characterization. The results are listed in Table 4 to Table 6. In Table 6, the hypervolume is zero due to the low sparsity of the initial point; this does not significantly affect the results. We find that, as the parameter decreases, the number of iterations and the running time increase. However, judging from the performance profiles used before, decreasing the parameter does not strengthen the characterization of the Pareto frontier and even yields slightly worse results on some problems. We therefore confirm that 1e-3 is a reasonable choice for this parameter.
Class | Problem | SAPGM | DNNM | ||
iter | time | iter | time | ||
Two obj | CR&MF2 | 43600 | 97.6322 | 200000 | 244.7978 |
CB3&LQ | 43600 | 128.4447 | 223294 | 471.4855 | |
CB3&MF1 | 43600 | 96.1398 | 150972 | 348.6537 | |
Large scale | Spar = 0.1 | ||||
500*100 | 2200 | 27.4354 | 95498 | 10941.3457 | |
1000*200 | 2200 | 50.6058 | 31635 | 2558.4360 | |
2000*400 | 2200 | 208.9523 | 199316 | 82865.6740 | |
Spar = 0.2 | |||||
500*100 | 2200 | 32.7338 | 165255 | 11572.2238 | |
1000*200 | 2200 | 65.8336 | 173010 | 21074.3461 | |
2000*400 | 2200 | 322.2478 | 198621 | 41724.1614 | |
Spar = 0.5 | |||||
500*100 | 2200 | 28.2064 | 194611 | 12932.9436 | |
1000*200 | 2200 | 80.7623 | 113755 | 14413.0340 | |
2000*400 | 2200 | 171.9536 | 200000 | 48799.6603 | |
Three obj | JOS1& | 43600 | 129.6857 | 48344 | 132.5368 |
BK1& | 43600 | 346.8673 | 761652 | 1311.6791 | |
SP1& | 43600 | 345.6614 | 196950 | 405.0506 |
Metric | CR & MF2 | |||||
---|---|---|---|---|---|---|
=1e-1 | =1e-2 | =1e-3 | =1e-5 | =1e-7 | =1e-10 | |
Iter | 800 | 6200 | 43600 | 200000 | 200000 | 200000 |
Time | 1.1963 | 8.1796 | 93.1940 | 629.3381 | 633.3625 | 2046.4206 |
Purity | 0 | 0.0693 | 0.8713 | 0.8713 | 0.8713 | 0.8713 |
/ | / | 6.9795 | 6.9795 | 6.9795 | 6.9795 | |
/ | / | 0.8068 | 0.8068 | 0.8068 | 0.8068 | |
hvs | 0 | 0 | 128.0904 | 128.0904 | 128.0904 | 128.0904 |
Metric | JOS1 & | |||||
---|---|---|---|---|---|---|
=1e-1 | =1e-2 | =1e-3 | =1e-5 | =1e-7 | =1e-10 | |
Iter | 800 | 6200 | 43600 | 200000 | 200000 | 200000 |
Time | 1.9047 | 17.9726 | 129.6857 | 1644.7979 | 1645.9337 | 2146.4206 |
Purity | 0 | 0 | 0.9559 | 0.9559 | 0.9559 | 0.9559 |
/ | / | 0.1591 | 0.1591 | 0.1591 | 0.1591 | |
/ | / | 0.8635 | 0.8635 | 0.8635 | 0.8635 | |
hvs | 0 | 0 | 0.5866 | 0.5866 | 0.5866 | 0.5866 |
Metric | Large scale problem when (m,n,Spar) = (500,100,10%) | |||||
---|---|---|---|---|---|---|
=1e-1 | =1e-2 | =1e-3 | =1e-5 | =1e-7 | =1e-10 | |
Iter | 800 | 2200 | 2200 | 2200 | 2200 | 2200 |
Time | 14.7241 | 50.6055 | 27.4354 | 36.1575 | 44.6184 | 40.6530 |
Purity | 1.0000 | 1.0000 | 0.9600 | 0.9570 | 0.9570 | 0.9570 |
0.3364 | 0.3256 | 0.3256 | 0.3256 | 0.3256 | 0.3256 | |
1.9225 | 2.3421 | 2.3420 | 2.3421 | 2.3421 | 2.3421 | |
hvs | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
Figure 2. Pareto fronts for the small-scale bi-objective problems: (a) CR&MF2, (b) CB3&LQ, (c) CB3&MF1.
Figure 3. Pareto fronts for the large-scale problems with spar = 10%: (a) (m,n) = (500,100), (b) (m,n) = (1000,200), (c) (m,n) = (2000,400).
Figure 4. Pareto fronts for the large-scale problems with spar = 20%: (a) (m,n) = (500,100), (b) (m,n) = (1000,200), (c) (m,n) = (2000,400).
Figure 5. Pareto fronts for the large-scale problems with spar = 50%: (a) (m,n) = (500,100), (b) (m,n) = (1000,200), (c) (m,n) = (2000,400).
Figure 6. Pareto fronts for the three-objective problems: (a) BK1, (b) JOS1, (c) SP1.
Figure 7. Performance profiles of SAPGM and DNNM: (a) iteration counts, (b) time, (c)–(d) spread metrics, (e) hypervolume, (f) purity.
7. Conclusions
In this paper, we propose a smoothing accelerated proximal gradient method with extrapolation term (SAPGM) designed for the resolution of nonsmooth convex multiobjective optimization problems. Each iteration employs the accelerated proximal gradient step with an extrapolation coefficient to minimize problem (1) with a fixed smoothing parameter, followed by an update of the smoothing parameter. Moreover, we prove an improved convergence rate by means of a global energy function. Additionally, we show that the sequence of iterates converges to a Pareto optimal solution of the problem. An effective strategy for solving the subproblem is presented through its dual representation. The numerical results demonstrate the superior performance of the SAPGM algorithm and underscore the importance of extrapolation in achieving faster convergence rates.
For future work, we plan to study the influence of the algorithm parameters on the convergence speed and to give more general parameter selection criteria, which will facilitate applying the algorithm to specific problems and enhance its practical value. In addition, we will study whether SAPGM performs well on problems with higher dimensions and more objective functions, and we hope to generalize the norm used in the target problem, which would benefit applications to large-scale sparse optimization. This, however, will require further theoretical support.
References
- [1] H. Attouch and J. Peypouquet, The rate of convergence of Nesterov's accelerated forward-backward method is actually faster than 1/k^2, SIAM Journal on Optimization, 2016, 26(3): 1824-1834.
- [2] S. Bandyopadhyay, S. K. Pal and A. B. Aruna, Multiobjective GAs, quantitative indices, and pattern classification, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2004, 34(5): 2088-2099.
- [3] N. Belgasmi, L. Ben Saïd and K. Ghédira, Evolutionary multiobjective optimization of the multi-location transshipment problem, Operational Research, 2008, 8: 167-183.
- [4] H. Bonnel, A. N. Iusem and B. F. Svaiter, Proximal methods in vector optimization, SIAM Journal on Optimization, 2005, 15(4): 953-970.
- [5] G. A. Carrizo, P. A. Lotito and M. C. Maciel, Trust region globalization strategy for the nonconvex unconstrained multiobjective optimization problem, Mathematical Programming, 2016, 159: 339-369.
- [6] A. Beck and M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM Journal on Imaging Sciences, 2009, 2(1): 183-202.
- [7] A. Chambolle and C. Dossal, On the convergence of the iterates of the “fast iterative shrinkage/thresholding algorithm”, Journal of Optimization Theory and Applications, 2015, 166: 968-982.
- [8] X. Chen, Smoothing methods for nonsmooth, nonconvex minimization, Mathematical Programming, 2012, 134: 71-99.
- [9] A. Chinchuluun and P. M. Pardalos, A survey of recent developments in multiobjective optimization, Annals of Operations Research, 2007, 154(1): 29-50.
- [10] A. L. Custódio, J. F. A. Madeira, A. I. F. Vaz and L. N. Vicente, Direct multisearch for multiobjective optimization, SIAM Journal on Optimization, 2011, 21(3): 1109-1140.
- [11] E. D. Dolan and J. J. Moré, Benchmarking optimization software with performance profiles, Mathematical Programming, 2002, 91: 201-213.
- [12] E. K. Doolittle, H. L. M. Kerivin and M. M. Wiecek, Robust multiobjective optimization with application to internet routing, Annals of Operations Research, 2018, 271: 487-525.
- [13] F. Facchinei and J.-S. Pang, Finite-dimensional variational inequalities and complementarity problems, Springer New York, 2003.
- [14] F. Ye, H. Liu, S. Zhou, et al., A smoothing trust-region Newton-CG method for minimax problem, Applied Mathematics and Computation, 2008, 199(2): 581-589.
- [15] J. Fliege and B. F. Svaiter, Steepest descent methods for multicriteria optimization, Mathematical Methods of Operations Research, 2000, 51: 479-494.
- [16] J. Fliege, L. M. G. Drummond and B. F. Svaiter, Newton’s method for multiobjective optimization, SIAM Journal on Optimization, 2009, 20(2): 602-626.
- [17] E. H. Fukuda and L. M. Graça Drummond, Inexact projected gradient method for vector optimization, Computational Optimization and Applications, 2013, 54: 473-493.
- [18] B. Gebken and S. Peitz, An efficient descent method for locally Lipschitz multiobjective optimization problems, Journal of Optimization Theory and Applications, 2021, 188: 696-723.
- [19] A. A. Goldstein, Optimization of Lipschitz continuous functions, Mathematical Programming, 1977, 13: 14-22.
- [20] M. Haarala, K. Miettinen and M. M. Mäkela, New limited memory bundle method for large-scale nonsmooth optimization, Optimization Methods and Software, 2004, 19(6): 673-692.
- [21] J. Hakanen and R. Allmendinger, Multiobjective optimization and decision making in engineering sciences, Optimization and Engineering, 2021, 22: 1031-1037.
- [22] J. B. Hiriart-Urruty and C. Lemaréchal, Convex analysis and minimization algorithms I: Fundamentals, Springer Science & Business Media, 1996.
- [23] S. Huband, P. Hingston, L. Barone and L. While, A review of multiobjective test problems and a scalable test problem toolkit, IEEE Transactions on Evolutionary Computation, 2006, 10(5): 477-506.
- [24] M. Jaggi, Revisiting Frank-Wolfe: Projection-free sparse convex optimization, in International Conference on Machine Learning, 2013: 427-435.
- [25] Y. Jin (Ed.), Multi-objective machine learning, Springer Science & Business Media, 2006.
- [26] A. Kumari, Subgradient methods for non-smooth vector optimization problems, International Journal of Pure and Applied Mathematics, 2015, 102(3): 563-578.
- [27] L. R. Lucambio Pérez and L. F. Prudente, Nonlinear conjugate gradient methods for vector optimization, SIAM Journal on Optimization, 2018, 28(3): 2690-2720.
- [28] L. Lukšan and J. Vlcek, Test problems for nonsmooth unconstrained and linearly constrained optimization, Technical report, 2000.
- [29] N. Mahdavi-Amiri and R. Yousefpour, An effective nonsmooth multiobjective optimization method for finding a weakly Pareto optimal solution of nonsmooth problems, International Journal of Applied and Computational Mathematics, 2012, 1(1): 1-21.
- [30] M. M. Mäkela and P. Neittaanmäki, Nonsmooth optimization: Analysis and algorithms with applications to optimal control, World Scientific, 1992.
- [31] M. M. Mäkelä, Multiobjective proximal bundle method for nonconvex nonsmooth optimization: Fortran subroutine MPBNGC 2.0, Reports of the Department of Mathematical Information Technology, Series B. Scientific Computing, 2003, 13.
- [32] O. Montonen, N. Karmitsa, and M. M. Mäkelä, Multiple subgradient descent bundle method for convex nonsmooth multiobjective optimization, Optimization, 2018, 67(1), 139-158.
- [33] Y. Nesterov, A method for unconstrained convex minimization problem with the rate of convergence $O(1/k^2)$, Dokl. Akad. Nauk SSSR, 1983, 269(3): 543.
- [34] Y. Nesterov, Smooth minimization of non-smooth functions, Mathematical Programming, 2005, 103: 127-152.
- [35] Y. Nishimura, E. H. Fukuda and N. Yamashita, Monotonicity for Multiobjective Accelerated Proximal Gradient Methods, arXiv preprint arXiv:2206.04412, 2022.
- [36] Y. Nikulin, K. Miettinen, and M. M. Mäkelä, A new achievement scalarizing function based on parameterization in multiobjective optimization, OR Spectrum, 2012, 34, 69-87.
- [37] P. Ren, Z. Zuo, and W. Huang, Effects of axial profile on the main bearing performance of internal combustion engine and its optimization using multiobjective optimization algorithms, Journal of Mechanical Science and Technology, 2021, 35, 3519-3531.
- [38] R. T. Rockafellar, Convex analysis, Princeton University Press, 1997.
- [39] M. Rocca, Sensitivity to uncertainty and scalarization in robust multiobjective optimization: An overview with application to mean-variance portfolio optimization, Annals of Operations Research, 2022: 1-16.
- [40] R. T. Rockafellar and R. J. B. Wets, Variational analysis, Springer, 1998.
- [41] E. E. Rosinger, Interactive algorithm for multiobjective optimization, Journal of Optimization Theory and Applications, 1981, 35, 339-365.
- [42] H. Tanabe, E. H. Fukuda, and N. Yamashita, An accelerated proximal gradient method for multiobjective optimization, Computational Optimization and Applications, 2023, 1-35.
- [43] H. Tanabe, E. H. Fukuda, and N. Yamashita, New merit functions for multiobjective optimization and their properties, Optimization, 2023, 1-38.
- [44] H. Tanabe, E. H. Fukuda, and N. Yamashita, Proximal gradient methods for multiobjective optimization and their applications, Computational Optimization and Applications, 2019, 72, 339-361.
- [45] H. Tanabe, E. H. Fukuda, and N. Yamashita, A globally convergent fast iterative shrinkage-thresholding algorithm with a new momentum factor for single and multi-objective convex optimization, arXiv preprint arXiv:2205.05262, 2022.
- [46] P. Wang and Y. Ma, A dynamic multiobjective evolutionary algorithm based on fine prediction strategy and nondominated solutions-guided evolution, Applied Intelligence, 2023, 1-22.
- [47] F. Wu and W. Bian, Smoothing Accelerated Proximal Gradient Method with Fast Convergence Rate for Nonsmooth Convex Optimization Beyond Differentiability, Journal of Optimization Theory and Applications, 2023, 197(2), 539-572.
- [48] Z. Xia, Y. Liu, J. Lu, et al., Penalty method for constrained distributed quaternion-variable optimization, IEEE Transactions on Cybernetics, 2020, 51(11), 5631-5636.
- [49] W. Xian, F. Huang, Y. Zhang, et al., A faster decentralized algorithm for nonconvex minimax problems, Advances in Neural Information Processing Systems, 2021, 34, 25865-25877.
- [50] J. Zhang and X. Yang, The convergence rate of the accelerated proximal gradient algorithm for multiobjective optimization is faster than $O(1/k^2)$, arXiv preprint arXiv:2312.06913, 2023.
- [51] E. Zitzler, Evolutionary algorithms for multiobjective optimization: Methods and applications, Shaker, Ithaca, 1999.