Assessment of Arctic sea ice simulations in CMIP5 models using a synthetical skill scoring method

The Arctic sea ice cover has declined at an unprecedented pace since the late 20th century, and mounting evidence indicates that sea ice anomalies feed back on atmospheric circulation. While climate models consistently reproduce a decreasing trend of sea ice cover, their results show a large spread. To evaluate model performance in simulating Arctic sea ice cover and its potential role in climate change, this study constructed a metric that synthesizes both the linear trends and the anomalies of sea ice. Particular focus was placed on the Barents Sea and the Kara Sea, where sea ice anomalies have the highest potential to affect the atmosphere. The investigated models can be grouped into three categories according to their normalized skill scores. The strong contrast among the multi-model ensemble means of the different groups demonstrates the robustness and rationality of this method. Potential factors that account for the different performances of the climate models are further explored. The results show that model performance depends more on the ozone datasets prescribed by a model than on its chemical representation of ozone.


Introduction
In recent decades, the high latitudes of the Northern Hemisphere have shown the most visible signal of climate change, with surface warming at least twice as strong as the global average (the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, i.e., IPCC AR5). This Arctic amplification effect has been attributed to several factors, including atmospheric and oceanic heat transport and solar radiation effects (Holland and Bitz, 2003; Alexeev et al., 2005; Graversen et al., 2008; Serreze et al., 2009; Screen and Simmonds, 2010a; Walsh, 2014). Among these, the retreating sea ice cover plays a central role (Serreze et al., 2009; Screen and Simmonds, 2010a; Chapman and Walsh, 2007).
Climate change in the Arctic is asymmetric and differs from region to region (Overland et al., 1997; Venegas and Mysak, 2000; Semenov and Bengtsson, 2003; Rogers et al., 2013). Several places, such as the Barents Sea and the Kara Sea (BK), are important for Arctic climate change, likely due to the active exchange of heat and momentum at the ocean-atmosphere interface in these regions. Yang and Yuan (2014) pointed out that the weakening and collapse of the wintertime sea-ice double dipole mode can be largely attributed to the thermal effect of the pronounced BK sea ice reduction during late fall and early winter. In addition to this local effect, several regional signals in the atmosphere-sea ice-ocean system can be amplified and extend to affect the Arctic as a whole, or even the lower latitudes (Semenov and Bengtsson, 2003; Bengtsson et al., 2004; Semenov, 2008; Semenov and Halford, 2009; Smedsrud et al., 2013). For example, anomalously low BK sea ice may affect the European climate through an atmospheric bridge, leading to unusually cold winters in Europe (Petoukhov and Semenov, 2010). In fact, BK sea ice has been widely recognized as an important forcing factor for the mid-to-high latitude wintertime climate, influencing the position and strength of storm tracks, blocking systems, extreme events, and the Arctic Oscillation (Petoukhov and Semenov, 2010; Francis and Vavrus, 2012; Yang et al., 2016; Ruggieri et al., 2016). The distinctive role of BK sea ice in the Arctic climate system was further reported by Kim et al. (2016): unlike in other Arctic marginal seas, BK sea ice anomalies can persist throughout an entire year due to the so-called "insulation feedback" mechanism. Solar radiation absorbed in summer is stored as heat in the BK seas and released during the fall-winter season, resulting in "delayed warming" (Francis et al., 2009; Deser et al., 2010; Screen and Simmonds, 2010b).
This may account for the high correlation between the winter BK sea ice extent (SIE) and the summer Arctic SIE in both model simulations and observations (Li et al., 2017). The BK sea ice thus plausibly serves as an indicator of Arctic climate change, motivating the assessment of model performance presented here.
Given the complex interactions between atmospheric circulation and Arctic sea ice, climate models coupling the ocean, atmosphere, and sea ice have been used to diagnose the factors driving Arctic climate change and to predict future changes. The Coupled Model Intercomparison Project (CMIP) of the World Climate Research Programme (WCRP) comprises a series of contemporary climate models. The IPCC Fifth Assessment Report shows that the models participating in CMIP5 (http://pcmdi3.llnl.gov/esgcet) generally perform better than those in CMIP3 in reproducing long-term trends of sea ice. Rosenblum and Eisenman (2016) reported that the inclusion of volcanic activity, rather than improvements in sea ice physics or model resolution, accounts for the advantage of the CMIP5 models over the CMIP3 models in simulating Arctic sea ice trends. Although the observed seasonal cycle and long-term trend of Arctic SIE are well reproduced by CMIP5 models, the spread of the SIE projected throughout the 21st century by CMIP5 models remains similar to that of the CMIP3 models (Stroeve et al., 2012; Huang et al., 2017). Stroeve and Notz (2015) further reported that the uncertainty range of simulated future sea-ice evolution remains large for climate models. Massonnet et al. (2012) provided several important metrics that constrain projections of summer Arctic sea ice. By reducing the inter-model spread of CMIP5 projections, Liu et al. (2013) derived consistent Arctic ice-free periods.
To address the problems of inter-model dispersion and uncertainty in Arctic sea ice simulation, this study provides an objective assessment of sea ice simulation in CMIP5. Thirty CMIP5 models were comprehensively evaluated for their capability to simulate Arctic sea ice variability. Given the strong feedback of the BK seas on the Arctic climate and their interconnection with the mid-latitude climate, the BK region was differentiated from the rest of the Arctic (exBK) and assigned a larger weight. Both the long-term trends and the anomalies of sea ice were considered. Building on this BK-priority weighting, a comprehensive and objective assessment framework was constructed to quantify the models' ability to simulate sea ice. Based on this framework, a set of better-performing models can be identified to constrain model biases and to provide a firmer basis for the study of Arctic ocean-ice-atmosphere interaction as well as future Arctic climate change predictions. Moreover, the model configurations were further scrutinized, and a possible way to improve model performance in Arctic sea ice simulation is suggested.

Data and method
The CMIP5 model simulation dataset (Taylor et al., 2012) can be downloaded directly from http://pcmdi3.llnl.gov/esgcet/home.htm. Thirty models were selected based on the completeness and availability of their sea ice output, as shown in Table 1. HadISST1 sea ice concentration (SIC) data are used in this study as the observational reference to evaluate the models (Rayner et al., 2003). HadISST1 is a global monthly sea surface temperature (SST) and sea ice dataset on a 1° × 1° grid extending from 1871 to the present, compiled from various sources including digitized sea ice charts and passive microwave retrievals. The dataset is homogenized by compensating satellite microwave-based SICs for the impact of surface melt on retrievals in the Arctic and for algorithm deficiencies in the Antarctic, as well as by matching historical in situ concentrations with satellite data. Since in situ data prior to the satellite era are sparse and highly inhomogeneous, both the model and the observational data were truncated to start from 1979.
The results of historical simulations were analyzed from 1979 to 2005, extended with Representative Concentration Pathway 8.5 (RCP8.5) simulations from 2006 to 2014. The exceptions are HadGEM2-CC and HadGEM2-ES, for which historical simulations from 1979 to 2004 were used, extended with RCP8.5 simulations from 2005 to 2014. Given the inconsistency of the various grids and projections used by CMIP5 models, the model outputs were preprocessed by interpolating them onto the same grid as the HadISST1 data.
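As a minimal illustration of this preprocessing step, the following sketch regrids a model field onto a regular 1° grid by nearest-neighbour lookup. The function name and the use of plain NumPy are our assumptions; a production workflow would typically use a bilinear or conservative scheme (e.g., via xarray or ESMF) rather than this simplified stand-in.

```python
import numpy as np

def regrid_nearest(field, src_lat, src_lon, dst_lat, dst_lon):
    """Nearest-neighbour regridding of a 2-D field onto a target grid.

    field            : (nlat_src, nlon_src) array on the source grid
    src_lat, src_lon : 1-D source coordinate vectors (degrees)
    dst_lat, dst_lon : 1-D target coordinate vectors (degrees)
    """
    # For every target coordinate, pick the index of the closest source coordinate.
    i = np.abs(src_lat[:, None] - dst_lat[None, :]).argmin(axis=0)
    j = np.abs(src_lon[:, None] - dst_lon[None, :]).argmin(axis=0)
    return field[np.ix_(i, j)]

# Toy example: a 2-degree source grid mapped onto a 1-degree target grid.
src_lat = np.arange(60.0, 90.0, 2.0)
src_lon = np.arange(0.0, 360.0, 2.0)
dst_lat = np.arange(60.5, 89.5, 1.0)
dst_lon = np.arange(0.5, 359.5, 1.0)
sic = np.random.default_rng(0).uniform(0.0, 1.0, (src_lat.size, src_lon.size))
sic_1deg = regrid_nearest(sic, src_lat, src_lon, dst_lat, dst_lon)
```

Regridding a field onto its own grid returns it unchanged, which gives a simple sanity check for the index construction.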
Due to the significance of BK sea ice variability for recent climate change, a weighted scoring method was applied. The quantification proceeds as follows. First, the SIC fraction of each grid cell was multiplied by the cell area to calculate the sea ice area (SIA) for the BK (70.5°-81.5°N, 19.5°-100.5°E) and exBK regions, respectively:

SIA = Σ_{i,j} SIC_{i,j} · r² cos(φ_i) Δφ Δλ,  (1)

where φ_i is the latitude of grid cell (i, j), Δφ and Δλ are the grid spacings in radians, and the Earth radius r = 6 371 km. The linear trends of the SIA time series were then estimated by least squares. Comparing the SIA trends of the model outputs with observations, the relative error of the trends was calculated as

Relative error = (X_mod − X_obs) / X_obs,  (2)

where X_mod and X_obs represent the modeled and observed SIA trends, respectively. A lower absolute value of the relative error indicates a better model performance. Second, detrended SIC anomaly time series were obtained for each grid cell by subtracting both the climatology and the linear trend from the original data. A quantitative comparison between the model results and observations was then conducted using the skill score developed by Willmott (1981):

Sk = 1 − Σ(X_mod − X_obs)² / Σ(|X_mod − X̄_obs| + |X_obs − X̄_obs|)²,  (3)

where X represents the variable, X̄ its time mean, and the subscripts mod and obs denote model results and observations, respectively. Skill (Sk) values were calculated for each grid cell, and the final skill value was obtained by averaging. A higher skill value indicates a better model performance. A detailed description of this method can be found in Warner et al. (2005).
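The quantities in Eqs (1)-(3) can be sketched in a few lines of Python. This is our own NumPy rendering of the described procedure, not the authors' code; the function names and the (time, lat, lon) array layout are assumptions.

```python
import numpy as np

R_EARTH = 6371.0  # Earth radius in km

def cell_area(lat_deg, dlat=1.0, dlon=1.0):
    """Area (km^2) of a dlat x dlon cell centred at latitude lat_deg (the Eq. (1) weight)."""
    return R_EARTH**2 * np.deg2rad(dlat) * np.deg2rad(dlon) * np.cos(np.deg2rad(lat_deg))

def sea_ice_area(sic, lat_deg):
    """Eq. (1): sum of concentration-weighted cell areas. sic: (time, nlat, nlon) in [0, 1]."""
    w = cell_area(lat_deg)[None, :, None]   # broadcast over time and longitude
    return (sic * w).sum(axis=(1, 2))       # km^2 per time step

def trend(x):
    """Least-squares linear trend (units of x per time step)."""
    t = np.arange(len(x))
    return np.polyfit(t, x, 1)[0]

def relative_error(x_mod, x_obs):
    """Eq. (2): relative error of the modeled SIA trend."""
    return (trend(x_mod) - trend(x_obs)) / trend(x_obs)

def willmott_skill(x_mod, x_obs):
    """Eq. (3): Willmott (1981) index of agreement; 1 = perfect."""
    num = np.sum((x_mod - x_obs) ** 2)
    den = np.sum((np.abs(x_mod - x_obs.mean()) + np.abs(x_obs - x_obs.mean())) ** 2)
    return 1.0 - num / den
```

A model series identical to the observations yields a relative error of 0 and a skill of 1, the respective best values of the two metrics.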
So that higher values consistently indicate better performance, the residual relative error (RRE), defined as one minus the absolute value of the relative error, was used in place of the relative error itself. For the exBK and BK regions, four values (RRE_exBK, Sk_exBK, RRE_BK and Sk_BK) were thus obtained. To synthetically evaluate the CMIP5 models' sea ice simulations, these four components were normalized to obtain N-scores (i.e., N_RRE_exBK, N_Sk_exBK, N_RRE_BK and N_Sk_BK). Finally, a weighting method was used to quantitatively combine the trend error (Eq. (2)) and the anomaly skill (Eq. (3)).
Since BK sea ice may exert a far-reaching effect on the Arctic climate and a remote effect on the Northern Hemisphere extratropical atmospheric circulation, it was weighted more heavily. The weight coefficients of the four factors, namely the exBK sea ice trend, the BK sea ice trend, the exBK sea ice anomalies, and the BK sea ice anomalies, were 0.1, 0.3, 0.2 and 0.4, respectively. The final score was then calculated by summing the four weighted N-scores to give the F-score and normalizing it to obtain the NF-score:

F-score = 0.1 N_RRE_exBK + 0.3 N_RRE_BK + 0.2 N_Sk_exBK + 0.4 N_Sk_BK.  (4)
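Assuming the normalization is a z-score across the model ensemble (the text does not specify the exact normalization), the weighting step of Eq. (4) can be sketched as follows; the function names are ours.

```python
import numpy as np

# Weights from the text: exBK trend 0.1, exBK anomaly 0.2, BK trend 0.3, BK anomaly 0.4.
WEIGHTS = {"RRE_exBK": 0.1, "Sk_exBK": 0.2, "RRE_BK": 0.3, "Sk_BK": 0.4}

def zscore(v):
    """Normalize a component across the model ensemble (assumed z-score)."""
    v = np.asarray(v, float)
    return (v - v.mean()) / v.std()

def nf_score(components):
    """components: dict mapping metric name -> per-model values.
    Returns the normalized weighted sum (NF-score) per model, cf. Eq. (4)."""
    f = sum(WEIGHTS[k] * zscore(components[k]) for k in WEIGHTS)
    return zscore(f)
```

Models are then ranked by sorting the NF-scores in descending order, as in Table 2.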
The final scores for the 30 models are listed in Table 2. The weight coefficients are subjective and reflect the relative importance of the various items qualitatively rather than quantitatively. The metric was therefore chosen based on our understanding of the asymmetric sea ice variability within Arctic regions and the significance of BK sea ice variability for Arctic climate change. BK sea ice variability is the most remarkable within the Arctic region on both interannual and multi-decadal time scales. Statistical analysis also showed that BK sea ice variability is closely linked to that of other Arctic regions (data not shown).

Results
As mentioned above, a quantitative score for each model was obtained using the synthetical skill scoring method. To group the models objectively, they were sorted in descending order of their NF-scores (Table 2), and the root mean square (RMS) errors of the multi-model ensemble means (MMEMs) were computed for different numbers of ensemble members (Fig. 1). Three aspects of the RMS error were investigated: the SIA seasonal cycle (Fig. 1a), the standard deviation (Fig. 1b), and the running trends (Fig. 1c). All three RMS curves exhibit a consistent decrease while the number of ensemble members is below 10; beyond 10 members, the curves either level off or ascend slightly as further models are included. In light of this RMS analysis, the first 10 models were assigned to the first group (Group I), and the remaining 20 models were evenly divided by their weighted scores into two reference groups (Groups II and III). Several metrics were then applied to assess the rationality and robustness of this scoring and grouping system. Figure 2 shows the climatological seasonal cycle of the SIA for each group. The RMS errors of the MMEMs were 0.46×10⁶, 0.59×10⁶ and 1.79×10⁶ km² for Groups I, II and III, respectively. The MMEM of Group I evidently conformed to the observations more closely than those of Groups II and III. The observed SIA reaches a maximum in March and a minimum in September, and the models in Group I consistently reproduced this feature. For Groups II and III, however, the dispersion among the members clearly increased. Although almost all models qualitatively simulated the seasonal waxing and waning of sea ice, their monthly SIA climatologies varied by as much as 2×10⁶-5×10⁶ km² within Groups II and III, and even the MMEMs of these two groups clearly deviated from the observations.
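The member-selection step, i.e., computing the RMS error of the k-member ensemble mean as models are added in descending NF-score order, can be sketched as below. This is a hypothetical illustration: `seasonal_cycles` stands in for the models' monthly SIA climatologies, and the function name is our own.

```python
import numpy as np

def rms_vs_members(seasonal_cycles, obs_cycle):
    """seasonal_cycles: (n_models, 12) monthly SIA climatologies,
    pre-sorted by descending NF-score. Returns the RMS error of the
    k-member ensemble mean for k = 1 .. n_models (cf. Fig. 1a)."""
    errors = []
    for k in range(1, len(seasonal_cycles) + 1):
        mmem = seasonal_cycles[:k].mean(axis=0)              # k-member MMEM
        errors.append(np.sqrt(np.mean((mmem - obs_cycle) ** 2)))
    return np.array(errors)
```

The grouping threshold (10 models in this study) is then read off where the RMS curve levels off.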
The SIA was overestimated in wintertime and underestimated in summertime for Group II, but was consistently overestimated throughout the year for Group III.
Note to Table 2: RRE represents the residual relative error for the BK and exBK trends, and Sk represents the skill for the BK and exBK anomalies. The normalized RRE and Sk are labeled N_RRE and N_Sk for each component. The F-score was obtained as the weighted sum of the four components (N_RREs and N_Sks). The F-score and its normalized version, the NF-score, are listed as the grouping criterion in the last two columns in descending order.

Multi-model ensemble mean state
To further probe the spatial distribution of the SIA climatology, the SIA was calculated at each longitude, and the climatological meridionally integrated SIA in March and September is presented for all three groups (Fig. 3). In March (Fig. 3a), the MMEM SIAs of all three groups conformed closely to the observations between 60°E and 140°E and between 180°W and 122°W, covering the Kara, Laptev, Bering, and Beaufort Seas. At other longitudes, Group I performed much better than the other two groups. The Group II MMEM tends to overestimate the SIA in the Greenland and Barents Seas, while the Group III MMEM tends to underestimate the SIA in Hudson Bay and adjacent regions and to overestimate it in the Greenland and Barents Seas. The gaps among the three groups were wider in September (Fig. 3b) than in March. Group III models overestimate the SIA in most parts of the Arctic Ocean, with the poorest performance in the Barents and Kara Seas. The overall skills between the MMEM of each group and the observations also differed considerably, with scores of 0.97, 0.93 and 0.70 for the groups in descending order. The superiority of Group I over Groups II and III was most striking for the BK region in March and September, which was predictable given the larger weights assigned to the BK in the evaluation system.
The SIA trend was also estimated for each longitude (Fig. 4). The MMEM SIA trends of all groups were negative in all investigated regions, which qualitatively agrees with the observations except for the region near Greenland, where the observed SIA increased. The MMEM SIA trends of the three groups were well distinguished in the BK, where the MMEMs of Groups I and II were much closer to the observational data than that of Group III. In other regions, the differences between the three MMEMs and the observations were less significant. The skills between the MMEM of each group and the observations were 0.79 (Group I), 0.67 (Group II), and 0.33 (Group III).
The skill score distributions of the MMEM detrended SIC anomalies for each group in the exBK region and the BK region are displayed in Figs 5 and 6, respectively. In the BK, the skill scores of Group I were much higher than those of either Group II or Group III (Fig. 6). In the exBK region, the differences among the three groups were asymmetric and less significant (Fig. 5). The clearest hierarchy among the three groups was found in the Beaufort Sea and the Laptev Sea. The spatial-mean MMEM skill scores for Groups I, II and III were 0.30, 0.24 and 0.24 in the exBK region and 0.35, 0.28 and 0.22 in the BK region, showing a descending order with group number in both regions (with Groups II and III tied in the exBK). It appears that models that better simulate sea ice in the BK region also tend to achieve higher scores in other Arctic regions.

Individual model contrast
The remarkable contrast among the MMEMs of the different groups indicates that the weighted method can distinguish the models' general capability for sea ice simulation. However, the ensemble means blur the individual differences between group members, so it is necessary to explore the detailed skill scores model by model. A heatmap was therefore used to display the model differences in the various skill score terms (Fig. 7). A heatmap is a graphical representation of data in which the individual values of a matrix are represented by colors. In this two-dimensional heatmap, the models are listed in descending order of their NF-scores along the x-axis, and their scores for the various sub-items are shown on the y-axis. The colored squares show the normalized score values for each model and each sub-item. It is apparent that the first 10 models (Group I) generally achieved much higher scores in every sub-item than the models of Group III. The models of Group II exhibit a rather chaotic order in the sub-item scores, which, to some extent, reflects their large dispersion. The contrast between the groups is more pronounced for the BK anomaly than for the exBK trend, since different weights were assigned to these sub-items. It is worth noting that the exBK trend score of the 4th model (model No. 13, GFDL-CM3, Group I) was extraordinarily low (≤ -1.0), in contrast to the extraordinarily high score (≥ 1.0) of the 27th model (model No. 26, Group III).
To analyze this result further, a detailed comparison of the sea ice simulations of both models with the observations was conducted (Fig. 8). Figure 8a shows the annual-mean SIA time series. Models 26 and 13 both exhibit downward trends consistent with the observations. However, Model 13 somewhat overestimated the rate of ice decline, particularly over the most recent two decades. Model 26, although reproducing the observed rate of sea ice decrease, dramatically underestimated the climatological SIA, and its linear correlation coefficient with the observations was 0.56, much smaller than that of Model 13 (0.75). For the seasonal cycle (Fig. 8b), Model 13 was much closer to the observations than Model 26. To compare the two simulations in detail, the distributions of the trend differences between Models 13 and 26 and the observational data are presented in Figs 8c and d. Model 13 underestimated the SIA trends almost everywhere. In contrast, Model 26 overestimated the trends in the Beaufort Sea and the East Siberian Sea and underestimated them in Baffin Bay and the Bering Sea. The mean linear trends averaged over the total Arctic region were -29 980 km²/a for Model 13 and 343 km²/a for Model 26. Therefore, the superiority of Model 26 over Model 13 in the exBK trend metric may be attributed to the asymmetric distribution of trend errors among the sub-regions and the cancellation between them. Indeed, if the absolute value of the exBK trend score in each grid cell is used, the difference in the mean skill distribution between the two models becomes negligible.
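The cancellation effect noted above, where opposite-signed regional trend errors offset each other in an area mean, can be illustrated with hypothetical numbers (the values below are invented for illustration, not taken from the models):

```python
import numpy as np

# Hypothetical per-region trend errors (km^2/a) with opposite signs,
# mimicking a model that overestimates the trend in some seas and
# underestimates it in others.
trend_errors = np.array([+1500.0, -1400.0, +800.0, -900.0])

signed_mean = trend_errors.mean()            # errors cancel: looks deceptively good
absolute_mean = np.abs(trend_errors).mean()  # reveals the true error magnitude
```

Averaging the signed errors yields a misleadingly small value, whereas averaging the absolute errors exposes the bias, which is why using absolute grid-cell values removes Model 26's apparent advantage.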

Potential factors accounting for the performance of climate models
It seems that the applied weighted scores measure the models' capability to simulate sea ice well. However, the reason for the dispersion of the model simulations, particularly within Group II, remains unknown. The model parameters, the grid resolution, and the way models represent stratospheric ozone have been proposed as potential factors affecting model performance in sea ice simulation (Turner et al., 2009; Sigmond and Fyfe, 2010; Zunz et al., 2013). To investigate the effect of ozone on model performance, the ozone representations and the prescribed ozone datasets of the 30 CMIP5 models, as reorganized by Eyring et al. (2013), are listed in Table 3. According to the ozone representation, these models can be roughly grouped into two categories: one contains 10 models with interactive or semi-offline ozone chemistry (CHEM, bold model names in Table 3), and the other contains 19 models with a prescribed ozone representation (NOCHEM, normal model names in Table 3). Most of the NOCHEM models prescribe ozone in both the stratosphere and the troposphere; the exception is HadGEM2-ES, which uses prescribed ozone in the stratosphere but interactive ozone chemistry in the troposphere, and it was therefore removed from the following statistics to avoid ambiguity. In contrast to the models with fully prescribed ozone, a model was denoted semi-offline if its prescribed ozone dataset had been calculated with the underlying CMIP5 chemistry-climate model using prescribed SSTs and SICs, following historical emissions as reported by Lamarque et al. (2010) and future emissions under the RCP scenarios as described by Lamarque et al. (2011). These models differ from the prescribed-ozone CMIP5 models (NOCHEM) because their stratospheric ozone evolution responds to changes in the greenhouse gas concentrations of the four RCPs, although it is still calculated offline (Eyring et al., 2013).
The average scores of the four metrics and their weighted mean were calculated and are displayed in Fig. 9a. For the weighted mean, the scores were -0.22 for the NOCHEM models and 0.41 for the CHEM models. This strong contrast suggests the superiority of interactive or semi-offline ozone chemistry over a prescribed ozone representation. Comparing the model score groups in Table 2 with the ozone representations in Table 3 shows that 8 of the 10 lowest-scoring models applied a prescribed ozone representation (NOCHEM). Nevertheless, the CHEM and NOCHEM models were well matched near the top of the ranking (3 CHEM models vs. 3 NOCHEM models in Group I, and 7 CHEM models vs. 3 NOCHEM models in the top 10 models). The BK anomaly skill scores were consistent with the weighted mean scores. In contrast, the linear trend scores were not very sensitive to the choice of ozone representation. This suggests that interactive and semi-offline ozone chemistry may greatly improve model performance in Arctic sea ice simulation on interannual to decadal time scales, while exerting little effect on the simulated linear trend.
Fig. 7. Score heat map of the four metrics and the weighted means of the 30 models. One metric is the linear trend of the yearly sea ice area of the Arctic outside the Barents and Kara Seas (exBK), and another is the anomaly of the monthly detrended sea ice concentration of the exBK. The other two metrics are the same but for the Barents and Kara Seas. Magenta solid squares represent scores > 1; purple, scores between 0.5 and 1; blue-purple, scores between -0.5 and 0.5; steel blue, scores between -1 and -0.5; cyan, scores < -1.
To further explore the effect of ozone on model performance, the models with prescribed ozone and with semi-offline ozone chemistry were subdivided according to the ozone datasets they used, and the analysis examined whether the various ozone datasets affect the simulation of sea ice. The ozone datasets used by the 18 NOCHEM models comprise C1 (10 models), P5 (2 models), P7 (2 models), CmodA2 (2 models), CmodA3 (1 model), and CmodB4 (1 model). The latter two datasets were excluded because their sample sizes were too small. The ozone datasets used by the semi-offline chemistry models comprise P2 (3 models) and P6 (3 models). In addition, the scores of the interactive ozone chemistry models (4 models) were estimated for comparison, although these models have no prescribed ozone data. The average scores of the sub-group models for the four metrics (BK trend, exBK trend, BK anomaly and exBK anomaly) and their weighted mean scores were calculated; the results are shown in Fig. 9b. For the weighted mean scores, the CmodA2 models (score = 1.2) were far superior to the other three categories in the NOCHEM group and even to the CHEM models; moreover, they achieved high and balanced scores in all four metrics. Within the semi-offline group, the P2 models achieved the second-highest score (0.74), in contrast to the P6 models. Like the interactive ozone models (denoted "I" in Table 3), the P2 models performed well when simulating sea ice anomalies but unsatisfactorily when simulating linear trends. The model preference can be demonstrated by comparing the ozone data of the top 10 models (Group I) and the lowest 10 models (Group III): models with the prescribed CmodA2 ozone data and models with the semi-offline P2 ozone data both ranked among the top 10, while none of the models with these two ozone datasets ranked among the lowest 10.
It seems that model performance depends more on the quality of the ozone dataset than on whether ozone is prescribed or interactive.
Although the models showed remarkable biases related to the choice of ozone dataset, other factors may also influence model performance. Therefore, the possible impact of model resolution on model performance was also investigated. However, no evidence was found for the superiority of high-resolution models over low-resolution models (data not shown). The scores of high-resolution models can be either higher or lower than those of low-resolution models, and even models with the same resolution can show entirely different sea ice simulation abilities (e.g.,

Conclusion
In this study, the Arctic sea ice simulations of the models participating in CMIP5 were evaluated. Four metrics, comprising the long-term trends and the SIC anomalies on interannual and decadal time scales in both the BK and the exBK regions, were integrated with different weight coefficients. The models were divided into three groups according to their weighted scores. The feasibility and robustness of this assessment method were verified against various metrics. Finally, several model properties (mainly the ozone representation) were investigated to explain the discrepancies between models.
In general, the MMEM values of the three groups were well distinguished in the annual cycle, linear trends, and interannual variability of the SIA, which demonstrates the rationality of the evaluation criterion and the feasibility of constraining model uncertainty by selecting models in this way. With regard to individual models, the high-ranking models were superior to the low-ranking models not in all metrics, but in the core metrics such as the BK trends and anomalies. This highlights the necessity of the weighted method in model assessment. Previous studies have evaluated model performance in sea ice simulation (Maslowski et al., 2012; Semenov et al., 2015; Shu et al., 2015). Climate models in general reproduce the retreating sea ice trend in the Arctic during the 20th century and simulate further sea ice area loss during the 21st century in response to anthropogenic forcing. However, these models still suffer from large biases and dispersion. Most previous studies therefore applied the MMEM to abate individual model biases, and they mostly focused on the sea ice tendency or long-term trend (Zhang and Walsh, 2006; Arzel et al., 2006). However, simulations vary from model to model, and sea ice variability is an important factor for atmospheric circulation. The presented assessment highlights each model's capability to simulate the BK sea ice trends and anomalies, owing to their significance in driving recent climate change both in the Arctic and in the mid-latitudes. In addition, a preliminary discussion of the reasons for model dispersion is presented. Several model properties, including resolution and ozone representation, were investigated. The results show that model resolution does not significantly affect model performance, in accordance with Rosenblum and Eisenman (2016). Instead, the ozone datasets that the models used were found to be an important factor; for NOCHEM model developers, this indicates that the use of the CmodA2 ozone data seems preferable. Although this study highlights the importance of BK sea ice simulation in this assessment system, good BK performance should be regarded only as a necessary but not a sufficient condition for climate models in predicting future climate change. Further research is required to explore the detailed physical processes and mechanisms by which ice anomalies exert their influence on both local and remote climate variability.
Note to Table 3: "-" means that models using interactive ozone chemistry have no prescribed ozone dataset.
Fig. 9. Bar graphs of the four metrics (N_RREs and N_Sks) and their weighted mean scores (NF-scores). a. Scores of models with two different stratospheric ozone representations: black bars denote models using a prescribed ozone representation; white bars denote models using interactive or semi-offline ozone chemistry. b. Scores of models that use different ozone datasets.