Assessing Corporate Sustainability Through Ratings: Challenges and Their Causes

Assessing corporate sustainability is increasingly practice-relevant, not least because the capital market and other markets have been paying growing attention to the topic. Recently, ratings have become an important assessment approach and nowadays a variety of organizations and financial service providers conduct their own ratings. Yet, despite their growing popularity, ratings are criticized in research and practice. Thus, the purpose of this paper is to systematize the challenges that corporate sustainability ratings face: lack of standardization, lack of credibility of information, bias, tradeoffs, lack of transparency, and lack of independence. Furthermore, the paper discusses the causes of these challenges and suggests possible ways to improve the reliability of ratings.


I. INTRODUCTION
Sustainability is a topic of growing significance for companies, just as the contribution of companies is becoming essential for sustainable development (Dunphy, Griffiths, and Benn; Dyllick and Hockerts; Epstein; Schaltegger and Burritt). Corporate sustainability (CS) is understood here as an approach to systematically consider environmental and social issues and to integrate them into the economic management of a company (Dunphy, Griffiths, and Benn; Shrivastava and Hart). Increasingly, the demand for CS is not only driven by societal or political expectations, i.e. push factors, but also by the potential for internal organizational improvements (e.g., cost reduction), as well as the demand of consumers and investors, i.e. pull factors (Dyllick, Belz, and Schneidewind; Meffert and Kirchgeorg; Schaltegger and Wagner). Examples of this latter market pull are the rising demand for organic food (Wier and Calverley) and the growing significance of socially responsible investment (SRI) (Beloe, Scherer, and Knoepfel; Moskowitz; Sparkes and Cowton).
Against this background, the research question of this paper is what challenges CS ratings face and what their causes are. The paper is structured as follows. Firstly, after a short introduction to the relevance of ratings, it displays and systematizes the challenges for CS ratings based on a literature review. Several ratings are included for illustration purposes. Secondly, the paper determines the causes of these challenges by reviewing more general literature on CS and CS assessment. Thereupon, the paper identifies ways to improve the reliability of CS ratings.

II. BACKGROUND: RELEVANCE OF RATINGS IN THEORY AND PRACTICE
This section elaborates on the relevance of external CS assessment from a theoretical perspective, and then highlights the practical importance of ratings in particular.

Table 1: Prevalent approaches to externally assess CS.

II.I RELEVANCE OF RATINGS FROM A THEORETICAL PERSPECTIVE
An important difficulty when assessing CS externally lies in information asymmetries (Lyon and Maxwell; Rischkowsky and Döring). Consumers, investors, and other stakeholders are not able to verify the sustainability claims made by companies, because they do not have access to the relevant information (Ramus and Montiel). This not only affects products (Jahn, Schramm, and Spiller) but also processes inside companies and along supply chains (Chatterji and Levine; Epstein). Reliable third-party institutions with resources to gather the needed information become important players (Healy and Palepu; Lee and Cho; Rischkowsky and Döring). Ratings or rating organizations are one example of such information intermediaries. Another important aspect is that CS is socially desired (de Boer; Epstein). Ongoing discussions in the media as well as the increasing importance of sustainability-oriented products, for example in the financial market, illustrate that society and markets are increasingly concerned with the topic (Hansen, Große-Dunker, and Reichwald; Meffert and Kirchgeorg; Sparkes and Cowton; Wier and Calverley). This fact may not only motivate companies to get involved with sustainability issues and to communicate about them, but also to communicate exclusively positive information and leave out negative information. In an extreme case, companies may even perceive an incentive to pass on false information in order to improve their reputation or market share (Darby and Karni; Laufer; Rischkowsky and Döring). The risk of such opportunistic behavior, known as greenwashing, is increased by the lack of a definition of CS and the large scope of different interpretations (van Marrewijk).
The outcome of such a situation may be a "market for (organic) lemons": stakeholders cannot identify sustainability-oriented companies (hidden characteristics) because of a lack of information or of trust in the offered information. This leads to a diminished willingness to pay for the companies' products or a lower readiness to invest. Ultimately, sustainability-oriented companies may be crowded out of the market (adverse selection) (Akerlof; Rischkowsky and Döring). This market failure probably causes negative effects on the environment and society when sustainability-oriented companies are replaced by exclusively economically-oriented ones. Accordingly, the contribution of companies to sustainable development of the economy and society will diminish even more.
Both Economics of Information (e.g., Shapiro; Stigler; Stiglitz) and principal-agent theory (Jensen and Meckling) (and related approaches like the stakeholder-agency theory, see Hill and Jones) deal with ways to overcome asymmetric information or adverse selection in markets. They offer two basic approaches to this problem. The first approach is signaling (Spence). Signaling in this context means that companies emit credible signals indicating their sustainability orientation. Examples are the publication of sustainability reports offering stakeholders information on sustainability efforts, and the establishment and use of brands or labels transporting and substantiating sustainability-related messages about products or companies (de Boer; Finch; Kolk). However, these signals only fulfill their function if the addressees perceive them as reliable (Müller; Rischkowsky and Döring). Yet, reliability is not always given due to the "climate of general distrust towards social organizations" (Renn and Levine 212) and the risk of opportunistic behavior. Therefore, signaling may be insufficient in the context of CS.
An alternative approach to overcome information asymmetries is screening, which here means that consumers, investors, or other stakeholders actively search for and evaluate information on the sustainability performance of companies (Rischkowsky and Döring; see also Stiglitz). Compared to earlier times, the Internet allows for much more transparency and information access today (Rezabakhsh, Bornemann, Hansen, and Schrader; Seelos). Yet, consumers and investors cannot access all relevant data because of resource constraints (time and data access). Hence, information intermediaries come into play (Healy and Palepu; Lee and Cho; Rischkowsky and Döring). Ratings are an important example of this kind of external assessment, although screening for CS is complicated by the diverse perception of the concept. Yet, although several challenges have to be met in order to reliably assess CS by screening, it still appears more promising than signaling, which makes opportunistic behavior easier (Graafland, Eijffinger, and Smid). Furthermore, screening simplifies the comparison of companies, which could be relevant to consumers and investors. Therefore, this paper focuses on ratings as a practice-relevant application of screening.
Nonetheless, when differentiating between signaling and screening it has to be kept in mind that one approach cannot be seen separately from the other. On the one hand, the assessment made through screening can be used to substantiate companies' signaling approaches, which might be perceived as more reliable than information without external verification (Rischkowsky and Döring). Audits, labels, and certificates also follow this procedure. On the other hand, in order to carry out their assessment, ratings at least partially depend on the disclosure of information by companies, and thus, on suitable internal metrics (Chatterji and Levine). For these reasons, CS signaling and screening are interdependent. Intermediaries carry out the screening process for stakeholders and substantiate companies' signals.

II.II PRACTICAL RELEVANCE OF CS RATINGS
CS ratings have become increasingly practice-relevant. Among the variety of CS assessment approaches, ratings play a special role, since they not only constitute an assessment approach themselves but also form the basis for further benchmarking approaches like rankings and indices (for more details on ratings see Schäfer, Beer, Zenker, and Fernandes; for the methodologies of major sustainability indices see Fowler and Hope). Therefore, the procedures that ratings apply have consequences for subsequent approaches.
Despite the visible efforts to assess CS, related approaches and particularly ratings are criticized in both research and practice (Beloe, Scherer, and Knoepfel; Chatterji and Levine; Chatterji, Levine, and Toffel; Delmas and Doctori-Blass; Dillenburg, Greene, and Erekson; Fowler and Hope; Graafland, Eijffinger, and Smid; Hansen; Sadowski, Whitaker, and Buckingham, Rate the Raters. Phase One; Schäfer, Beer, Zenker, and Fernandes). Hence, Beloe, Scherer, and Knoepfel (29) conclude that many research organizations "will have to fundamentally review many aspects of their research methodology and approach," and Sadowski, Whitaker, Lee, and Ayars (5) conclude that "the market will settle on a few 'winners'." The challenges that come along with CS ratings will be discussed in the following. Several practice-relevant ratings are drawn upon for illustration purposes.

III. CHALLENGES FOR CS RATINGS AND THEIR CAUSES
CS ratings are dealt with in research and practice. Although a certain amount of literature deals with the challenges for CS ratings, they have not been systematized so far. In section III.I six important aspects will be identified and elaborated: lack of standardization, lack of credibility of information, bias, tradeoffs, lack of transparency, and lack of independence. The synthesis builds on a review of academic literature as well as practice-relevant publications on ratings, indices, and related assessments of CS and identifies those aspects that are discussed in several publications. Table 2 offers an overview of the challenges and their meaning. Building on this, section III.II identifies the causes of the challenges and discusses them on the basis of more general CS literature.

III.I CHALLENGES FOR CS RATINGS

III.I.I. LACK OF STANDARDIZATION
Although CS ratings have spread, little standardization has been achieved. This is the result of the varying interests and perceptions that raters and stakeholders have in terms of CS. Beyond that, even those ratings that actually do address the same issues and interests apply varying measures and use their own methodology (Sadowski, Whitaker, Lee, and Ayars). The competing approaches have rarely been evaluated in academic research so far, although this is regarded as crucial for the construction of ratings (Chatterji, Levine, and Toffel; Sharfman) and indices (Fowler and Hope). Exceptions are for example works by Chatterji and Levine; Chatterji, Levine, and Toffel; Chatterji and Toffel; Knoepfel; and Sharfman. Furthermore, whereas the assessed companies may aim at standardization where possible (econsense), this is not desirable from the stakeholders' point of view because of their different perception of and interest in CS (Beloe, Scherer, and Knoepfel; Dillenburg, Greene, and Erekson; Graafland, Eijffinger, and Smid). Hence, standardization of ratings and the establishment of best practices are unlikely for the time being. Another cause for the lack of rating standardization is company-internal CS accounting and reporting (Schaltegger). Ratings use publicly available information as well as data disclosed by companies. Yet, the ways that companies gather and communicate information are typically very different. Especially the measurement of social issues as well as the evaluation of the influence of CS on companies' success is difficult and not organized systematically. Therefore, the data that ratings build upon is not necessarily comparable and quality might differ. This fact can distort the rating result.

III.I.II. LACK OF CREDIBILITY OF INFORMATION
In order to assess CS, ratings depend on suitable information. As discussed earlier, there is a significant lack of data availability. Thus, besides publicly available data (like company or media reports), raters at least partially depend on self-disclosure of companies. A lot of companies acknowledge the signaling function of ratings and take part in surveys (Dillenburg, Greene, and Erekson; Fowler and Hope; Schäfer, Beer, Zenker, and Fernandes), for example through investor relations departments which communicate with analysts and investors (Healy and Palepu). For instance, inclusion in the DJSI requires companies to "fill in a detailed questionnaire covering a wide range of weighted economic, environmental, and social factors" (Fowler and Hope).
Yet, the credibility of company information may be questioned, "[b]ecause managers have incentives to make self-serving voluntary disclosures" that will not negatively affect their competitive position (Healy and Palepu 425; see also Laufer). That is one reason why many rating organizations use additional publicly available information to verify data (Beloe, Scherer, and Knoepfel). For example, EIRIS refers to the information of "government and regulatory agencies, industry organizations, trade publications, campaigning bodies, academic and specialists' reports, and the output of other research bodies" (Schäfer, Beer, Zenker, and Fernandes 72). However, this information does not necessarily have to be credible either. The verification of information remains a "significant challenge" for research organizations (Beloe, Scherer, and Knoepfel 29; see also Laufer; Ramus and Montiel).
Additionally, Beloe, Scherer, and Knoepfel (29) observe that companies are still "by far the most important source of information" for research organizations. SAM states that their company questionnaire is "the most important source of information for the assessment" leading to the Dow Jones Sustainability Index (DJSI) (SAM Indexes GmbH). EIRIS declares that their survey serves to provide "the most recent and accurate information available." During the oekom rating procedure "considerable importance" is attached to the cooperation with companies (oekom research, oekom Corporate Rating). Despite the inclusion of additional information and the fact that many rating organizations today fill in large parts of the questionnaires based on public data themselves (Beloe, Scherer, and Knoepfel), these examples demonstrate that companies are to some extent still able to influence rating results.
Another important argument for the increased inclusion of publicly available data is 'questionnaire fatigue' resulting from the intensive surveying of companies (Beloe, Scherer, and Knoepfel; Chatterji and Levine; econsense). Companies have to spend considerable resources to take part in surveys and to interact with research organizations (Fowler and Hope; Chatterji and Levine). Besides the increasing unwillingness to participate in surveys, another possible negative side effect can be that inexperienced employees like interns complete the rating survey process. This questions the credibility of information even more (Hansen).

III.I.III BIAS
Another challenging aspect for CS ratings is bias. Schäfer, Beer, Zenker, and Fernandes state that many CS ratings are biased, meaning that they put special emphasis either on the environmental, social, or economic dimension. However, overemphasizing any one of the three dimensions is inconsistent with the integrative character of CS. According to that, companies are required to simultaneously take account of and harmonize the environmental, social, and economic dimension (Schaltegger and Burritt). The particular economic bias is especially strong in conventional ratings that use only selective CS measures as an add-on. However, the same bias exists in well-established assessment approaches like the DJSI, and thus, SAM's rating (Fowler and Hope). Fowler and Hope find that SAM does not consider the three dimensions of sustainability in a balanced way. SAM's assessment aims at identifying industry-specific best-in-class companies and focuses on those that are "most likely to turn sustainability into shareholder value" (Schäfer, Beer, Zenker, and Fernandes 101). Accordingly, social and environmental criteria weigh less than economic ones (Fowler and Hope). This also applies to KLD Research and Analytics, Inc. (now part of MSCI Inc.) whose declared objective is to serve investors (Chatterji and Toffel). Dillenburg, Greene, and Erekson (169) describe the consideration of social criteria in the assessment of large investment firms as "just a collateral service." This undifferentiated approach is criticized by many authors who highlight that ratings should be suitable for various stakeholders with different interests (Beloe, Scherer, and Knoepfel; Dillenburg, Greene, and Erekson; Graafland, Eijffinger, and Smid).
In contrast, special interest ratings may put more emphasis on ethical (or normative) and/or environmental issues while neglecting other dimensions. One example is the sustainability analysis of the Calvert Social Index, in which social and ethical aspects are analyzed in more detail than environmental aspects (Calvert Group, Ltd.; Schäfer, Beer, Zenker, and Fernandes).
Biases are also relevant for the type of companies to be rated. A lot of ratings, rankings, and indices aim at identifying sustainability leaders, for instance the DJSI. However, most ratings focus on larger companies and include neither small and medium enterprises nor companies from emerging countries (Beloe, Scherer, and Knoepfel; Fowler and Hope; Schäfer, Beer, Zenker, and Fernandes). Consequently, sustainability leaders may not be identified by this procedure, since the raters possibly do not even include them in the sample (Fowler and Hope) or they do not take part in the rating (self-selection bias) (Finch). Another difference in the selection process is the usage of an existing index as "underlying universe" versus actively screening for sustainability-oriented companies. For example, the Dow Jones Indexes (DJI) serve as parent indices for the DJSI (SAM Indexes GmbH) and several MSCI indices for the MSCI ESG Indices (MSCI Inc.), whereas the oekom universe also contains smaller companies and "significant non-listed bond issuers" (oekom research, oekom universe).

III.I.IV. TRADEOFFS
Closely connected to biases are tradeoffs. Most ratings ultimately aim at producing one single score, a number or letter, as the result of the rating process. For example, oekom's rating uses categories between A+ and D- (oekom research, oekom Corporate Rating), and SAM's rating works with percentages (SAM and PwC). Expressing the performance of companies in such a simple way makes it easy to understand companies' positions and to compare them (Graafland, Eijffinger, and Smid). Nonetheless, when creating a single score of the individual measures across the triple bottom line, raters assume that "values can be reduced to one dimension" (Graafland, Eijffinger, and Smid 151) although they are "pluralistic in nature" (Graafland, Eijffinger, and Smid 140). Aiming at one single score means that shortcomings in one dimension may be compensated by a better performance in another (Delmas and Doctori-Blass). Hence, single scores probably result in a distorted picture of the actual sustainability performance of a company because they hardly take into account all facets of CS. Companies are required to embed sustainability management in conventional management instead of dealing with it in parallel. This implies that CS has to be linked to the strategy, core business, and day-to-day processes in all organizational units (Stubbs and Cocklin). This integration challenge complicates the assessment of CS, since the better activities, outcomes, and budgets are integrated, the more difficult they are to identify as sustainability-oriented. One single score is hardly able to reflect these interdependencies properly. Furthermore, CS is not a state to be reached (de Ron; Epstein; Schaltegger and Burritt). Instead, the concept entails the demand for continuous improvement, which shows its process character. Hence, an evaluation of CS should be carried out in relative terms and requires the comparison to a benchmark. One single score can only accomplish this by relating to other scores, for example of other companies or earlier ratings of the same company. Graafland, Eijffinger, and Smid even demand not to conduct cross-sector benchmarking but to limit comparisons to one industry. In fact, rating results often include an additional comparative score. For example, SAM translates sustainability scores into a relative industry measure (SAM and PwC). Vigeo and Forum Ethibel state in their rulebook on the Ethibel Sustainability Indices that they intentionally do not calculate a global company score or compile a ranking based on the results of the individual research fields. Still, especially rankings normally oversimplify CS assessment.
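The compensation effect of single scores described in this section can be made concrete with a small numerical sketch. The weights, dimension scores, and peer scores below are purely hypothetical and are not taken from any actual rating; the weighted average and the share-of-peers measure are assumed stand-ins for whatever aggregation and industry comparison a given rater applies.

```python
# Hypothetical weights for the three dimensions (illustrative only,
# not the weighting scheme of any real rating organization).
WEIGHTS = {"economic": 0.5, "environmental": 0.25, "social": 0.25}


def single_score(scores, weights=WEIGHTS):
    """Collapse dimension scores (0-100) into one weighted average."""
    total_weight = sum(weights.values())
    return sum(weights[dim] * scores[dim] for dim in weights) / total_weight


def relative_to_industry(score, peer_scores):
    """Return the share of industry peers that score strictly lower."""
    if not peer_scores:
        raise ValueError("peer_scores must not be empty")
    return sum(1 for peer in peer_scores if peer < score) / len(peer_scores)


# Two hypothetical companies with very different sustainability profiles.
balanced = {"economic": 60, "environmental": 60, "social": 60}
lopsided = {"economic": 90, "environmental": 50, "social": 10}

print(single_score(balanced))  # 60.0
print(single_score(lopsided))  # 60.0, the weak social score is compensated

# A relative measure against hypothetical peers from the same industry.
print(relative_to_industry(60.0, [40.0, 55.0, 60.0, 75.0]))  # 0.5
```

Both hypothetical companies receive the same single score although one of them performs poorly in the social dimension, which is the kind of tradeoff that an aggregated figure conceals. Reporting the dimension scores alongside the aggregate, or restricting comparisons to peers within one industry, would preserve more of this information.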

III.I.V. LACK OF TRANSPARENCY
When discussing the lack of transparency, it has to be pointed out positively that most of the criteria accounted for in ratings are not determined by the raters alone but together with third parties like NGOs or academia. This first step in the direction of "tripartism" (Laufer 259) serves to ensure that ratings are more balanced and accepted and increases transparency and accountability (Fowler and Hope). Nonetheless, the research components leading to rating results are rarely made fully available, sometimes except for key clients (Beloe, Scherer, and Knoepfel). This refers to the way information is collected, the methodology, assumptions, calculations, weightings, threshold values, and the specific criteria of the analysis (Beloe, Scherer, and

III.I.VI. LACK OF INDEPENDENCE
The relationship between companies and raters established in order to get the necessary information raises the question whether ratings are independent. Research organizations increasingly depend on the personal interaction with companies (Beloe, Scherer, and Knoepfel). This is especially true when the rating process is carried out repeatedly over time, which is usually the case. For example, oekom emphasizes the importance of the cooperation with companies during their rating (oekom research, oekom Corporate Rating) and SAM states that it seeks to "proactively engage with companies" (SAM and PwC 21).
The close relationship to companies might call for even more criticism in cases where ratings are conducted by financial service providers which already have or intend to establish further business relations with the companies (e.g., consultancy, financial analysis, or mandated risk assessments) (AI CSRR; Beloe, Scherer, and Knoepfel). These aspects might create conflicts of interest. They are discussed in the European Corporate Sustainability and Responsibility Research Quality Standard (CSRR-QS), a quality standard for CS and SRI research (see www.csrr-qs.org). Another potential conflict brought up by Healy and Palepu is the personal interest of financial analysts in screening outcomes: "analysts are rewarded for providing information that generates trading volume and investment banking fees for their brokerage houses" (Healy and Palepu 417). This may encourage upward biases of rating results.
One more relevant aspect in this context is the distinction between solicited and unsolicited ratings. Solicited ratings are carried out for a particular client and paid for (Finch). This fact also puts into question the independence of the ratings. So far the paper has identified six important challenges that come along with CS ratings. Of course, more challenges can be found in the literature, for example in the "Rate the Raters" publications (Sadowski, Whitaker, and Buckingham, Rate the Raters. Phase One) or from a philosophical point of view (Graafland, Eijffinger, and Smid). Still, the six challenges described here together form the most prominently discussed aspects. In the following, the paper analyzes the causes of these challenges and suggests ways to tackle them.

III.II. WHAT ARE THE CAUSES OF THE IDENTIFIED CHALLENGES?
The six challenges that CS ratings face have been identified as lack of standardization, lack of credibility of information, bias, tradeoffs, lack of transparency, and lack of independence. In the following, the paper discusses the causes of these challenges based on general literature on CS and CS assessment.

III.II.I. LACK OF RATING STANDARDIZATION AND THE COMPLEXITY OF CS
The lack of rating standardization is not only the outcome of the competitive market for ratings but also the result of the complexity of CS. Even if there were a commonly accepted definition of the concept, it would still be highly complex. However, research and practice have widely agreed upon the triple bottom line approach requiring the mutual consideration of environmental, social, and economic aspects (Elkington). According to this approach, CS comprises a contribution to sustainable development of companies on the one hand and to the environment, society, and economy on the other (Loew, Ankele, Braun, and Clausen; Schaltegger and Burritt). CS therefore has to be assessed not only with regard to its various constituent parts, but also to long-term or rebound effects and further interdependencies (Stahlmann and Clausen; Wiedmann, Lenzen, and Barrett). Furthermore, the results of CS cannot be traced by "focusing on what goes on within the factory fences, farm gates, or company premises" (Wiedmann, Lenzen, and Barrett 362). CS typically crosses companies' boundaries, which implies that their sustainability performance is not only to be assessed in terms of internal measures but also of "impact" (Epstein; Wiedmann, Lenzen, and Barrett).
Assessment on the impact level is dealt with more closely for example in development agencies, and despite those agencies' long experience it remains a complex issue (Roche). The consequence is that companies' sustainability performance is very difficult to assess (Graafland, Eijffinger, and Smid). That is why a large variety of internal and external approaches exist that deal differently with the assessment of CS.
Of course, this applies to ratings and their varying methodologies, too, and makes standardization efforts like the CSRR-QS (AI CSRR) or SustainAbility's "Rate the Raters" research program (Sadowski, Whitaker, Lee, and Ayars) necessary. Accordingly, missing standardization affects not only ratings but all CS assessment approaches, since it results from the concept of CS itself.

III.II.II. LACK OF CREDIBILITY OF RATING INFORMATION AND THE LACK OF DATA AVAILABILITY
The question of credibility of the information that ratings use and offer is directly related to the lack of CS data availability. This problem affects internal as well as external CS assessment. Whereas internally the major problems are mostly matters of knowledge, information systems, and other management tools (Schaltegger), externally the question is rather one of limited data access. Most of the information required by ratings, if collected at all, is sensitive and rarely made publicly available (Lyon and Maxwell). Thus, not only rating organizations but all providers of CS assessments depend on self-disclosure of companies in addition to publicly available data. Therefore, suitable internal assessment is indispensable for the accomplishment of external assessment (Chatterji and Levine). Furthermore, due to the complexity of CS the question remains which data to measure. Accordingly, the lack of credibility of information results from the lack of CS data and therefore affects every CS assessment.

III.II.III. RATING BIAS AND THE FINANCIAL BACKGROUND OF RATINGS' USERS
Another aspect is the bias of ratings. As already described, the emphasis on economic issues is a result of the increasing interest of conventional analysts in sustainability. These actors probably have only little interest in the mutual consideration and integration of the economic, environmental, and social dimension because of their finance-oriented background. Investor-focused ratings rather regard environmental and social issues as an add-on.
Other CS assessment approaches may face different biases. For example, organic food labels and consumer-focused ratings may mainly consider environmental aspects. Thus, biases opposing the integrative assessment of CS are a challenge that other assessment approaches have to face alike. Still, the bias to the financial dimension is a problem that affects ratings in particular because of their use within the financial market and their stakeholders' demands.

III.II.IV. RATING TRADEOFFS AND THE DEMAND OF RATINGS' USERS
Tradeoffs also result from the demands of ratings' users. Most ratings are designed to primarily fulfill the needs of their main users, investors, who focus on traditional financial analysis (Beloe, Scherer, and Knoepfel; Delmas and Doctori-Blass; Dillenburg, Greene, and Erekson; econsense). Presenting the rating results in the form of single scores makes them easy to compare and communicate, and thus, suitable for investment decisions. Additionally, many ratings also serve for rankings and indices, which makes it inevitable to have a single, comparable figure. Beyond that, the communication of the results of CS assessments in a comprehensive, and at the same time, complete manner is challenging for other approaches, too.

III.II.V. LACK OF RATING TRANSPARENCY AND THEIR COMMERCIAL USE
A widely discussed challenge for ratings is their lack of transparency. When rating organizations do not disclose their methodology, weightings, etc., stakeholders cannot tell what it is that they measure. As long as ratings lack transparency, their credibility and reliability may be questioned just like the reliability of the companies to be examined. This particular challenge results primarily from the young, dynamic, and competitive rating market and the aim to maintain commercial advantage (Beloe, Scherer, and Knoepfel; econsense). Since it can be expected that only a few "winners" will remain in the market (Sadowski, Whitaker, Lee, and Ayars 5), raters try to generate and maintain unique selling propositions, and undisclosed methodologies are hard to imitate. However, it has to be pointed out that some rating organizations are already more transparent than others. For example, Beloe, Scherer, and Knoepfel refer to Ethibel, SAM Research, and Vigeo as best practice organizations, and Sadowski, Whitaker, and Buckingham (Rate the Raters. Phase One) point to Corporate Knights Inc. Furthermore, transparency does not only affect ratings, but is also discussed with regard to other "quality assurances and the substantiation of socially relevant claims" (de Boer 261), for instance certification processes for labels and audits (de Boer; Jahn, Schramm, and Spiller; Müller).

III.II.VI. LACK OF RATING INDEPENDENCE AND THE INTERMINGLED BUSINESS OF RATERS
The last aspect is the missing independence of ratings. Contact between raters and companies may be unavoidable, but in order to guarantee an objective assessment the relation should not be closer than necessary. In order to reliably assess CS, rating organizations should especially not have further bonds with companies because that may in the worst case offer an incentive to manipulate rating results. Graafland, Eijffinger, and Smid (139) argue that researchers should carry out the analysis in a "disinterested way." This problem is a matter of governance. As rating organizations often do not only carry out ratings but have intermingled relations to the assessed companies, their independence and objectivity have to be questioned. This aspect is reflected in a recent survey conducted among sustainability experts by Globescan. The survey shows that among different raters, NGOs are most trusted, followed by companies' employees. Rating and ranking organizations come only in the third place, mainstream investors even later. When asked about the trust in particular ratings and rankings, the highest ranked approach, the DJSI, was classified as "highly trusted" by not more than 48 per cent of the respondents (Sadowski, Whitaker, and Buckingham, Rate the Raters. Phase Two).
This lack of belief in the credibility of ratings is incompatible with their purpose to increase transparency and reliably reduce information asymmetries. The situation is comparable to that of certifiers and auditors (Epstein; Finch). Epstein (246) states that "some observers have wondered whether, as with financial auditors, verifiers should act as both consultants and auditors […]." Finch (17) finds that "the provision by auditors of nonaudit advisory services to companies undermines the independence of the audit." In the context of the food market, Jahn, Schramm, and Spiller describe the necessity of reducing auditors' dependency on the companies to be certified with regard to quality labels. The challenge of independence particularly affects organizations or businesses that have further relations to companies.
The six challenges identified and described may have different causes, but combined they diminish the reliability of ratings. Against the background of their causes, the upcoming section discusses possible improvements for each challenge.

IV. WAYS TO IMPROVE CS ASSESSMENT THROUGH RATINGS
In summary, and as Table 3 shows, the identified challenges have different causes and thus have to be tackled differently. Some of the challenges can be ascribed to the concept of CS itself and constitute general challenges when assessing CS (lack of standardization and lack of credibility of information). Furthermore, some challenges for CS ratings result from the financial background and demands of the ratings' users (bias and tradeoffs), whereas other challenges result from the commercial use of ratings and the intermingled business relations of raters (lack of transparency and lack of independence). In the following, recommendations are given to improve the reliability of ratings.

IV.I. GENERAL CHALLENGES WHEN ASSESSING CS
The lack of standardization and the lack of credibility of information of ratings are results of the complexity of CS and the lack of availability of CS data. Meeting these general challenges requires the contribution of various disciplines and actors in research and practice. On the one hand, the concept of CS itself is still hard to grasp. It can be expected and is desirable for the various actors involved to come to an agreement on a basic common definition in the near future. Furthermore, a more precise understanding of CS could be generated within the realm of ratings in particular, ideally in collaboration with third parties to include various perspectives on CS. A common understanding could enable coordinated research like that of the Sustainable Investment Research International Group (SIRI) (Chatterji and Levine; Schäfer, Beer, Zenker, and Fernandes). This is one way to reduce the large number of ratings, which could positively influence data availability and the credibility of data since fewer inquiries of greater quality would be directed at companies. NGOs and other third parties could furthermore be included in the data generation for external verification. So far, each rating uses its own measures, which is at least inefficient (Sadowski, Whitaker, Lee, and Ayars).

IV.II. THE FINANCIAL BACKGROUND AND DEMANDS OF RATINGS' USERS
Furthermore, some CS rating challenges result from the interest and demands of ratings' users: bias and tradeoffs. The particular bias towards financial issues and the demand for single, comparable scores in part even oppose the idea of CS. These challenges derive from the expectations of investors, financial analysts, and other ratings' users with a financial background. Instead of using CS as an add-on to conventional ratings, financial markets have to learn and acknowledge its integrative character, which entails more balanced assessments than what is common practice. This could be achieved by opening ratings for a wider audience (Sadowski, Whitaker, Lee, and Ayars) and by cooperating with stakeholders, especially NGOs and (potential) customers, who represent the environmental and social dimension of sustainability and thus bring in new perspectives (Laufer).
In the context of the financial market, identifying further Business Cases for Sustainability (Schaltegger and Wagner) might also help to accomplish a shift in the perception of CS from a "knock-out criterion" to a more (economically) relevant aspect. Furthermore, it is desirable to enable stakeholders with differing interests to make use of ratings (Sadowski, Whitaker, Lee, and Ayars). Rating results should be offered to stakeholders in a way that enables them to carry out their own evaluation according to their perceptions of and interests in CS. This could be a way to enhance the acceptance of ratings and to promote sustainable development. So far, most ratings, especially those used in the financial market, are not designed to handle this evaluative character of CS.
The same holds true regarding tradeoffs: the publication of detailed information on the calculation of a final score could serve to increase the interest of further stakeholders and to promote the use of ratings. Furthermore, biases in the units of analysis of ratings could be reduced by their extension to small and medium-sized enterprises.

IV.III. THE COMMERCIAL USE OF RATINGS AND THE INTERMINGLED BUSINESS RELATIONS OF RATERS
The lack of independence and the lack of transparency of ratings result from the characteristics of the rating organizations and the commercial use of CS assessment. As the Globescan results show, NGOs are trusted more than rating organizations, possibly because NGOs are less directly trying to make commercial use of CS assessments and because they rarely have further business relations with companies. A possible improvement for the reliability of ratings thus could be the prominent cooperation with one or more NGOs in the rating process (Laufer). However, independence and transparency are also relevant for other CS assessment approaches like audits, certificates, and labels. Similar recommendations apply here; for example, consultants should not be auditors at the same time (Epstein).
In order to increase their transparency, rating organizations could furthermore (alone or together with an NGO) disclose their methods, measures, and the content of their surveys. This applies to other assessment approaches like audits and labels, too. A further possibility to increase the reliability of ratings is to make use of independent assurance to verify commitments, ideally with an NGO due to their higher credibility (Laufer; Ramus and Montiel). Additionally, in order to provide reliable information and to enhance their credibility, rating organizations could, at least, disclose potential conflicts and how they are handled. At best, of course, those conflicts should be avoided and analysts should be completely independent. This applies to other intermediaries carrying out audits or assessments, too, be it on the general capital market (Healy and Palepu) or regarding CS in particular. Besides self-imposed principles, the establishment of standards, such as the CSRR-QS (AI CSRR), might help to increase trust in those research organizations. Further research in this area should be a sound combination of practice demands and theoretical contributions.
Table 3 offers a summary of the aspects discussed in this part.

V. CONCLUSION
Fostering sustainable development and CS in particular depends on suitable CS assessment approaches. The paper has shown that ratings, on the one hand, are a practice-relevant approach to assess CS externally. On the other hand, several characteristics of ratings are criticized in research and practice. This paper served to assemble and systematize the main rating challenges described in the literature: lack of standardization, lack of credibility of information, bias, tradeoffs, lack of transparency, and lack of independence. An analysis of these challenges reveals that they have different causes. Some general challenges when assessing CS result from the concept of CS itself (lack of standardization and lack of credibility of information). Other challenges result from the demand side of ratings and show the financial background and demands of the ratings' users (bias and tradeoffs). Last but not least, some challenges result from the supply side of ratings, namely the commercial use of ratings and the intermingled business relations of raters (lack of transparency and lack of independence). They also affect other CS assessment approaches like audits and labels. Improving the reliability of CS ratings is relevant, since they fulfill an important function with regard to overcoming the information asymmetry in the context of CS. Beyond that, ratings are able to positively influence companies' sustainability efforts, foster the institutionalization of information management, and stimulate competition between companies (Chatterji and Levine; Dillenburg, Greene, and Erekson; Fowler and Hope; Graafland, Eijffinger, and Smid). And despite the somewhat negative effects that it may have on the understanding of CS, "[t]he financial industry is in a unique position to move corporations towards corporate sustainability" (Delmas and Doctori-Blass 245). What is needed now is a "second generation" of ratings and related research (Beloe, Scherer, and Knoepfel 3) including NGOs and thereby other perspectives (Laufer). Especially those challenges resulting from the supply side of ratings (see Section IV.III) should be tackled proactively in order to increase the reliability and acceptance of ratings as a CS assessment approach.
CS assessment hurdles can be overcome through the initial improvements suggested in this paper. But, due to the interdisciplinary character of CS, these problems cannot be entirely solved by one actor, like raters, but require further research and contributions from several disciplines in research and practice. CS assessment is a process in its own right, just like CS itself.

Table 2: Challenges for ratings assessing CS.

Table 3: Rating challenges, causes, and possible improvements.