26 June 2013

SMotW #63: infosec budget variance

Security Metric of the Week #63: information security budget variance

This is, self-evidently, a financial information security metric, but what exactly is "Information security budget variance"?  Now there's the rub.

You might interpret it as a measure of the discrepancy between budgeted, permitted, authorized or allocated funds for information security and actual expenditure.  The illustrative graph above is a view of Acme Enterprise's information security budget variance on this basis over the course of a year, showing actual relative to predicted security expenditure (the zero dollar horizontal axis representing the budgeted spend).  Things are looking pretty grim for the first quarter but gradually improve as (presumably) firm action is taken to correct the overspend.  It looks as if there might even be a small surplus at the end of the year, perhaps enough to afford some discretionary expenditure such as a boost to the security awareness and training budget, or maybe a management away-day to work on the organization's security metrics!  This is an example of a management metric that would be valued by the CISO or Information Security Manager, and may be of some concern to higher and lower strata.
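For illustration, here's a minimal sketch (in Python, with entirely invented figures) of the first interpretation: monthly variance between actual and budgeted security spend, where positive numbers mean overspend.

```python
# Hypothetical monthly figures echoing Acme's year: heavy overspend in the
# first quarter, gradually corrected, ending with a small surplus.
monthly_budget = 100_000  # allocated security spend per month (assumed flat)
actual_spend = [130_000, 120_000, 110_000, 100_000, 95_000, 92_000,
                90_000, 88_000, 85_000, 83_000, 80_000, 78_000]

variance = [spend - monthly_budget for spend in actual_spend]
cumulative = sum(variance)

print(variance[:3])   # first-quarter overspend
print(cumulative)     # year-end position: negative = surplus
```

A real budget variance metric would of course also track phased (rather than flat) budgets and perhaps express the variance as a percentage.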

Alternatively, you might believe it refers to changes in the information security budget from year to year.  For example, a budget that has remained static for years, despite the ever-increasing number and severity of security incidents plus a growing burden of regulatory compliance, might be used to justify a significant increase in the security budget next year.  This would be a strategic metric with a comparatively long timeline, of greatest interest to senior/executive management, the CISO and the CFO.

Acme managers might use the PRAGMATIC scores for these two quite different metrics to assess their worth and decide whether to use neither, either or both of them, depending on what other metrics options are on the table.  No doubt in the course of considering the PRAGMATIC ratings, Acme management would think of possible drawbacks or issues (such as the practical difficulty of accurately measuring the total organization-wide expenditure on information security, which far exceeds the Information Security Management Department's budget) and perhaps come up with refinements (such as considering the benefits as well as the costs) to improve their scores.

At a more basic level, different Acme managers might unknowingly start out with distinct perspectives and objectives for the metric titled "Information security budget variance", differences that would come to a head almost as soon as the PRAGMATIC process kicked off.  Better now than later when "Information security budget variance" lands up in some management report somewhere, and the recipients interpret the metric in radically different ways, without even appreciating that their interpretations differ!

25 June 2013

Cart << horse

When we first met and started discussing information security metrics, Krag and I soon realized we shared the view that there are loads of possible metrics out there.  Anyone out shopping for security metrics is spoiled for choice, facing a bewildering array of things they could measure.  Far from being short of possible metrics, we face the opposite problem, choosing which of the plethora of metrics on offer to go with.  

Most security metrics people propose or recommend specific metrics.  The better ones at least make the effort to explain what the metrics are about, and a few take the trouble to justify their choices.  Here's a single example, a list of over 40 metrics recommended by Ben Sapiro on the LiquidMatrix blog:
Time to patch; time to detect; time to respond; system currency; time to currency; population size; vulnerability resilience/vulnerable population; average vulnerabilities per host; vulnerability growth rate versus review rate; infection spread rates; matched detection; unknown binaries; failure rates; restart rate; configuration mismatch rate; configuration mismatch density; average password age and length; directory health; directory change rate; time to clear quarantine; access error rates per logged in user; groups per user; tickets per user; access changes per user; new web sites visited; connections to botnet C&C’s; downloads and uploads per user; transaction rates; unapproved or rejected transactions; email attachment rates; email rejection/bounce rates; email block rates; log-in velocity and log-in failures per user; application errors; new connections; dormant systems; projects without security approval; changes without security approval; average security dollars per project; hours per security solution; hours on response; lines of code committed versus reviewed; and application vulnerability velocity.
That's not a bad list, as it happens, of readily-automated technical/IT security metrics.  Ben briefly explains each one, averaging about 30 words per metric.  He writes well and manages to squeeze quite a lot of meaning into those 30-odd words, hinting at what the metric really tells you, but inevitably there is far more left unsaid than said - not least, there's the issue of what other metrics Ben may have considered and rejected when compiling his shortlist, and on what basis he chose those 40+ metrics.   

If you're not yet convinced, sir, try on these lists, catalogs and sources of security metrics for size: CIS, OWASP, NIST, MetricsCenter, nCircle, ProjectQuant, ThirdDefense ... I could go on, but I'll leave the last word to Debra Herrmann's remarkable Complete Guide to Security and Privacy Metrics, all 800+ pages of it.

It's a bit like a child being spoon-fed medicine. "Here, take this, it's good for you".  It's the "Trust me" approach favored by vendors pushing complex technical products on an ignorant, naive or dubious market.  To put that another way, there is a strong tendency for metrics proponents to offer solutions (often their pet metrics) without taking the trouble to understand the problems.  Worse still, most are implicitly framing or bounding the problem space as a technical rather than a business issue by restricting the discussion to technical metrics derived from technical data sources.

What makes a given metric a good or a bad choice?  On the whole, the existing body of research on this topic failed to address this relatively straightforward issue well enough to offer usable, practical advice to busy CISOs, ISMs, ITSMs, risk managers and executives grappling with information security issues.  Whereas Andrew Jaquith, Dan Geer, Lance Hayden and others have tackled various parts of the issue, each in their own way, there was definitely something lacking.  In particular, we noticed a strong tendency to focus on automated, technical metrics i.e. the statistics spewed forth by most security systems, the logical extreme being SIEM (an expensive technical solution ... for what business problem, exactly?). 

We wrote about this at some length in PRAGMATIC Security Metrics.   Chapter 5 leads you on a voyage of discovery through a multitude of sources of candidate metrics, while chapter 6 lays out the PRAGMATIC criteria and method for honing a long list down to a short one, figuring out along the way the problems that your metrics are hopefully going to solve.   If you know what questions have to be answered, you know what information you need, hence the metrics all but choose themselves.

20 June 2013

More security metrics from another vendor survey

A website security survey by White Hat Security makes the point that 'a comprehensive metrics program' is valuable:
"The tactical key to improving a web security program is having a comprehensive metrics program in place – a system capable of performing ongoing measurement of the security posture of production systems, exactly where the proverbial rubber meets the road.  Doing so provides direct visibility into which areas of the SDLC program are doing well and which ones need improvement.  Failure to measure and understand where an SDLC program is deficient before taking action is a guaranteed way to waste time and money - both of which are always extremely limited."
Naturally, we agree with them that a 'comprehensive metrics system' (whatever that might be) is A Good Thing ... but it's not entirely clear to me how they reached that particular conclusion from the survey data.  Worse still, the survey design raises serious questions: whether 79 respondents are sufficient to generate statistically meaningful data, how those 79 respondents (and presumably not others) were selected, and exactly what they were asked ...

If you've been following our series about the Hannover/Tripwire survey (the introduction followed by parts one, two, three, four and five) this is an opportunity to think through the same kind of issues in the context of another vendor-sponsored survey.

Once again, I'd like to point out that I'm not saying such reports are worthless, rather that you need to read them carefully to counteract their natural bias.  It's a rare vendor-sponsored survey that doesn't have an agenda and/or serious flaws in the methodology, analysis and reporting.  Recognizing that is half the battle.

To be fair to White Hat Security, the report does outline some of their methods towards the end, mostly relating to their commercial website security assessment service, although the survey of 79 respondents is not well described.

Personally, I enjoy reading surveys to find out which metrics the authors have chosen to measure their chosen subjects, to learn both good and bad practices concerning experimental design etc., and to grab the odd soundbite such as the paragraph above (quoted out of context, I admit) for my own biased purposes.  Vendor-sponsored studies may or may not be scientifically sound, but so long as they make us think about the underlying issues, that's better than nothing, isn't it?

Gary (Gary@isect.com)

SMotW #62: security policy management maturity

Security Metric of the Week #62: security policy management maturity

As with the other ‘maturity metric’ examples given in the book (e.g. those for asset management, physical security, HR, business continuity and compliance) we envisage this metric as a scoring scale using predefined criteria against which the organization's security policy management practices are assessed and rated.

Here's the first of four rows from the example policy maturity metric in Appendix H:

0% - no information security policy management: There is nothing even remotely resembling a security policy as such.

33% - basic information security policy management: There is a security policy of sorts, although probably of poor quality (e.g. badly worded or inconsistent), incomplete and/or out-of-date with some elements undocumented.

67% - reasonable information security policy management: The information security policy is documented, reasonably complete, accurate and up-to-date, reflecting most corporate and external obligations, albeit somewhat stilted and difficult to read and apply in places and perhaps with limited coverage of topical issues such as cloud computing.

100% - excellent information security policy management: The information security policy materials are formalized, entirely complete, accurate, up-to-date, consistent and readable, explicitly reflecting a documented set of high level security principles, fully reflecting all corporate and external obligations and promoting generally accepted good security practices.

On each row, there are four scoring criteria denoting scores of 0, 33, 67 and 100 points on the percentage scale.  There is also a fifth, implied point: 50% marks the boundary between unacceptable (scores less than 50%) and acceptable (scores greater than 50%).

The scoring criteria are written in order to give the assessor a good steer for the kinds of information security policy management practices to look out for at each maturity level, yet these are merely examples rather than firm requirements.  For instance, the 33% scoring point on this row clearly refers to the presence of something resembling a 'security policy' (in marked contrast to the 0% point), but calls into question the quality and status of the document (again, distinguishing it from the higher scoring points).  If that is a fairly accurate description of the situation, the assessor can simply award a score of 33% for that row and move on to the next, but he/she has the discretion to award slightly higher or lower scores to reflect the unique way that the organization manages its security policies.  This allows some leeway to acknowledge strengths and weaknesses that may not be shown in the scoring criteria, or that may appear at different points on the scoring scale (e.g. if the security policy is formally documented but the quality of the document is poor, it might merit a score of say 40 or 50%).
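For what it's worth, here's a rough sketch of how a maturity metric along these lines might be scored: each row gets a percentage (usually one of the 0/33/67/100 anchor points, nudged at the assessor's discretion), and the overall metric is the mean across rows.  The row names and scores below are invented, purely for illustration.

```python
# Invented row scores for a four-row policy maturity assessment
row_scores = {
    "policy existence and quality": 40,   # between the 33% and 67% anchors
    "policy approval and ownership": 67,
    "policy communication": 33,
    "policy review and maintenance": 50,
}

overall = sum(row_scores.values()) / len(row_scores)
acceptable = overall >= 50  # 50% marks the acceptable/unacceptable boundary

print(round(overall, 1), "acceptable" if acceptable else "unacceptable")
```

Reporting the individual row scores alongside the overall mean is what makes the metric Actionable, since the weak rows point directly at what needs improving.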

Although this was not the top-scoring policy metric, it is clear from the metric's PRAGMATIC score that Acme's management were impressed with this one:

The scoring process and/or the Meaning of the final score may need to be explained when the metric is reported, for instance highlighting particular rows in the table against which the organization scored relatively strongly or weakly to demonstrate how the final score was determined.  Doing so would be an opportunity to address the Actionability issue, since the detailed findings indicate particular things that Acme could be doing to improve its maturity score.

By the way, the very act of drawing up or refining the scoring criteria used in maturity metrics like this is itself a sign of maturity in the organization’s approach to security metrics.  It takes some thought and effort to prepare the criteria, including research into good practices.  Gray-beard IT auditors or information security management professionals have generally experienced a wide variety of good and bad practices in past assignments, while there is plenty more advice in information security standards and methods concerning the kinds of things that the organization ought to be doing.

17 June 2013

Hannover/Tripwire metrics final part 5 of 5

So far in this series of bloggings, I have critiqued the top five metrics identified in the Hannover Research/Tripwire CISO Pulse/Insight Survey.  I'll end this series now with a quick look at the remaining six metrics and an overall conclusion.

Metric 6: "Legitimate e-mail traffic analysis"

While the analysis might conceivably be interesting, isn't the metric the output or result of that analysis rather than the analysis itself?  I'm also puzzled at the reference to 'legitimate' in the metric, since a lot hinges on the interpretation of the word.  Is spam legitimate?  Are personal emails on the corporate email system legitimate?  Where do you draw the line?  Working on the assumption that this metric, like the rest, is within the context of a vulnerability scanner system, perhaps the metric involves automatically characterizing and categorizing email traffic, then generating statistics.  Without more information, the metric is Meaningless.

Metric 7: "Password strength"

This could conceivably be a fairly sophisticated metric that takes into account a wide variety of characteristics of passwords (such as length, complexity, character set, character mix, predictability, quality of the hashing algorithm, time since last changed, relationship to known or readily guessed factors relevant to the users, relationship to users' privilege levels or data access rights and so on) across multiple systems.  More often, it is a much simpler, cruder measure such as the length of an individual password at the point it is being entered by a user, or the minimum password length parameter for servers or applications.  Both forms have their uses, but again without further information, we don't know for sure what the metric is about.

Metric 8: "Time to incident recovery" and metric 9: "Time to incident discovery" 

These metrics concern different parts of the incident management process.  At face value, they are simple timing measures but in practice it's not always easy to determine the precise points in time when the clock starts and stops for each one.  
Metric 8 implies that incidents are recovered (not all are), and that the recovery is completed (likewise).  If metric 8 were used in earnest, it would inevitably put pressure on incident handlers to close off incidents as early as possible, perhaps before the recovery activities and testing had in fact been finished.  This could therefore prove counterproductive.
Metric 9 hinges on identifying when incidents occurred (often hard to ascertain without forensic investigation) and when they were discovered (which may coincide with the time they were reported but is usually earlier).  The metric is likely to be subjective unless a lot of effort is put into defining the timepoints.  The tendency would be to delay the starting of the timer (e.g. by arbitrarily deciding that  an incident only counts if the business is impacted, and the time of that impact is the time of the incident), and to stop the timer as early as possible (e.g. by making presumptions about the point at which someone may have first 'spotted something wrong').  The accuracy and objectivity of the metric could be improved by more thorough investigation of the timing points, but that would increase the Costs at least as much as the benefits.
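At their simplest, metrics 8 and 9 reduce to timestamp arithmetic, which makes the definitional problem easy to see: the hard part is not the subtraction but deciding which events legitimately start and stop each clock.  A minimal sketch, with invented timestamps and assumed field names:

```python
from datetime import datetime

# One incident record; the field names and times are assumptions for
# illustration.  "occurred" in particular is often only estimable.
incident = {
    "occurred":   datetime(2013, 6, 1, 9, 0),
    "discovered": datetime(2013, 6, 3, 14, 30),
    "recovered":  datetime(2013, 6, 5, 17, 0),
}

time_to_discovery = incident["discovered"] - incident["occurred"]
time_to_recovery  = incident["recovered"] - incident["discovered"]

print(time_to_discovery)  # 2 days, 5:30:00
print(time_to_recovery)
```

Shift any of those three timepoints by a few hours of interpretation and the metric moves materially, which is exactly the subjectivity issue described above.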

Metric 10: "Patch latency"

On the assumption that this is some measure of the time lag between release of [security relevant] patches and their installation, this could be a useful metric to drive improvements in the efficiency of the patching process provided care is taken to avoid anyone unduly short-cutting the process of assessing and testing patches before releasing them to production.  Premature or delayed implementation could both harm security, implying that there is an ideal time to implement a given patch.  Unfortunately, it's hard to ascertain when the time is just right as it involves a complex determination of the risks, which vary with each patch and situation (e.g. it may be ideal to implement patches immediately on test or development systems, but most should be delayed on production systems, especially business-critical production systems).

Metric 11: "Information security budget as a % of IT budget" 

This is, quite rightly in my opinion, the least popular metric among survey respondents.  
It presumes that security and IT budgets are or should be linked.  That argument would be stronger if we were talking about IT security, but information security involves much more than IT e.g. physical security of the office.
In reality, there are many factors determining the ideal budget for information security, the IT budget being one of the least important.

Concluding the series

A few of the metrics in the Hannover Research/Tripwire CISO Pulse/Insight Survey only make much sense in the narrow context of measuring the performance of a vulnerability scanner, betraying a distinct bias in the survey.  Others are more broadly applicable to IT or information security, although their PRAGMATIC scores are mediocre at best.  Admittedly I have been quite critical in my analysis and no doubt there are situations in which some of the metrics might be worth the effort.  However, it's really not hard to think of much better security metrics - just look back through the Security Metrics of the Week in this blog, for instance, or browse the book for lots more examples.  Better still, open your eyes and ears: there's a world of possibilities out there, and no reason at all to restrict your thinking to these 11 metrics.

If you missed the previous bloggings in this series, it's not too late to read the introduction and parts one, two, three and four.

12 June 2013

SMotW #61: % of policies linked to objectives

Security Metric of the Week #61: proportion of information security policy statements unambiguously linked to control objectives

Measuring is one way to reinforce the linkage between policy statements and higher level control objectives or axioms.  Policies that bear no relation to control objectives/axioms raise the question: what are they meant to achieve?  How will the organization determine whether they are effective if the intended outcome is uncertain?  What is the justification for compliance with the policy, and what are the implications of low compliance?

Conversely, a strong security policy with a specific, legitimate purpose that cannot be linked to a control objective or axiom implies the need to fill a gap in the high-level control framework.
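The metric itself is a simple proportion.  A toy sketch, with invented policy statements and control objective identifiers, might look like this:

```python
# Map each policy statement to the control objective(s) it is
# unambiguously linked to; an empty list means no linkage was found.
# All statements and "CO-nn" identifiers here are invented.
policy_links = {
    "passwords must meet complexity rules": ["CO-07"],
    "laptops must use full-disk encryption": ["CO-03", "CO-12"],
    "visitors must be escorted at all times": [],   # gap: no linkage
    "backups must be tested quarterly": ["CO-09"],
}

linked = sum(1 for objectives in policy_links.values() if objectives)
proportion = 100 * linked / len(policy_links)
print(f"{proportion:.0f}% of policy statements linked")
```

The unlinked statements are the interesting by-product: each one either lacks a legitimate purpose or, as noted above, points at a gap in the high-level control framework.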

PRAGMATIC ratings:

"Unambiguously linked" leaves some wiggle room for subjective interpretation, while reviewing and assessing the linkages across the entire policy suite will inevitably take some Time to achieve.

72% is a pretty good PRAGMATIC score, making this a metric well worth considering unless there are other even-higher-scoring metrics that would achieve the same ends more effectively and efficiently.   If ACME Enterprises Inc. had identified concerns in relation to their policy coverage, this metric may be just the ticket to drive a policy review and improvement project, and perhaps it might be reported every year or two thereafter as an assurance measure.  You could say that the process and the metric need each other.

10 June 2013

The yin and yang of metrics

Many aspects of information security that would be good to measure are quite complex.  There are often numerous factors involved, and various facets of concern.  Take ‘security culture’ for example: it is fairly straightforward to measure employees’ knowledge of and attitudes towards information security using a survey approach, and that is a useful metric in its own right.  It becomes more valuable if we broaden the scope to compare and contrast different parts of the organization, using the same survey approach and the same survey data but analyzing the numbers in more depth.  We might discover, for instance, that one business unit or department has a very strong security culture, whereas another is relatively weak.  Perhaps we can learn something useful from the former and apply it to the latter.  This is what we mean by ‘rich’ metrics.  Basically, it involves teasing out the relevant factors and getting as much useful information as we can from individual metrics, analyzing and presenting the data in ways that facilitate and suggest security improvements.
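To make the 'rich metrics' idea concrete, here is a hedged sketch of re-analyzing the same survey data by department; the department names and scores (say, averages on a 1-5 survey scale) are invented for illustration.

```python
# The same cultural survey data, broken down by department rather than
# reported as a single organization-wide figure.
survey_scores = {
    "Finance":     [4.2, 4.5, 3.9, 4.1],
    "Engineering": [3.1, 2.8, 3.3, 2.9],
    "HR":          [3.8, 3.6, 4.0, 3.7],
}

by_dept = {dept: sum(s) / len(s) for dept, s in survey_scores.items()}
strongest = max(by_dept, key=by_dept.get)
weakest = min(by_dept, key=by_dept.get)
print(f"strongest culture: {strongest}, weakest: {weakest}")
```

Same data, deeper analysis: the comparison points directly at where to look for transferable good practice.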

‘Complementary’ metrics, on the other hand, are sets of distinct but related metrics that, together, give us greater insight than any individual metric taken in isolation.  Returning to the security culture example, we might supplement the employee cultural survey with metrics concerning security awareness and training activities, and compliance metrics that measure actual behaviors in the workplace.  These measure the same problem space from different angles, helping us figure out why things are the way they are. 

Complementary metrics are also useful in relation to critical controls, where control failure would be disastrous.  If we are utterly reliant on a single metric, even a rich metric, to determine the status of the control, we are introducing another single point of failure.  And, yes, metrics do sometimes fail.  An obvious solution (once you appreciate the issue, that is!) is to make both the controls and the metrics more resilient and trustworthy, for instance through redundancy.   Instead of depending on, say, a single technical vulnerability scanner tool to tell us how well we are doing on security patching, we might use scanners from different vendors, comparing the outputs for discrepancies.  We could also measure patching status by a totally different approach, such as patch latency or half-life (the time taken from the moment a patch is released to apply it successfully to half of the applicable population of systems), or a maturity metric looking at the overall quality of our patching activities, or metrics derived from penetration testing.  Even if the vulnerability scanner metric is nicely in the green zone, an amber or red indication from one of the complementary metrics should raise serious questions, hopefully in good time to avert disaster.
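The patch half-life idea is simple to compute once you have per-system patching timestamps.  A minimal sketch, with an invented patch release date and invented application times:

```python
from datetime import datetime, timedelta

release = datetime(2013, 6, 1)
# When each applicable system successfully applied the patch
# (None = not yet patched).  All figures invented.
applied = [release + timedelta(days=d)
           for d in (1, 2, 2, 3, 5, 8, 13, 21)] + [None, None]

population = len(applied)
done_times = sorted(t for t in applied if t is not None)

half_index = population // 2  # systems needed for the half-way point
if len(done_times) >= half_index:
    half_life = done_times[half_index - 1] - release
    print(half_life)  # elapsed time until half the population was patched
```

Tracking how the half-life trends from one patch cycle to the next is arguably more informative than any single value.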

A natural extension of this concept would be to design an entire suite of security metrics using a systems engineering approach.  We expand on this idea in the book, describing an information security measurement system as an essential component of, and natural complement to, an effective information security management system.

09 June 2013

Hannover/Tripwire metrics part 4

Today we're looking at the next two security metrics from the Hannover Research/Tripwire CISO Pulse/Insight Survey.  

The fourth most popular metric, "Platform compliance scores", isn't giving much away with just those three words to go on.  What platforms?  Compliant with what?  And how are they scored?   The fact that the survey was sponsored by Tripwire is a massive clue.  The surveyors and respondents were presumably thinking on the basis of using a technical tool to assess various IT system configuration parameters against something - possibly comprehensive corporate technical security standards, more likely recommendations from the tool and operating system vendors.  Such tools typically generate relatively simple metrics: counts of systems checked, non-compliance issues found and so forth.  They often classify issues by severity, although how they do so is uncertain ...

This all sounds fantastic, but how does the metric stack up in PRAGMATIC terms?  It concerns technical vulnerabilities, which are somewhat Predictive of technical security incidents, which are directly Relevant to IT security and partly relevant to information security.  The numbers are probably Genuine unless someone manipulates the assessment criteria and standards, which is possible since the people reporting the metric are the same people with the technical knowledge and access (i.e. the metric has limited Independence).  The metric is Meaningful to those same people, particularly if there is a more detailed breakdown by severity, but less meaningful to general management.  Accuracy is an issue since some vulnerabilities cannot be readily measured by a tool, while others may be found and reported even if they are not exposed to potential threats, or if exploitation causes no material impact (i.e. the metric does not take in the bigger risk picture).  Gathering and reporting the metric with an automated tool should be Timely, although slower human involvement is beneficial to interpret the raw numbers, present them and use them.
Overall, the metric is quite Cost-effective as an IT or technical security measure.

"Baseline defenses coverage" was the fifth most popular metric.  Again, it's hard to be sure what the survey's authors and respondents understood by those three words.  Presumably the metric is getting at a technical security baseline standard defining a suite of technical security configuration parameters etc. to provide a basic level of platform security.  Coverage could refer to the proportion of applicable systems that have fully implemented the baseline, and/or the proportion of security issues that are duly covered in the baseline standard.  This ambiguity immediately affects the metric's Meaning, as well as its Genuineness and Accuracy.  This is another narrow, technical metric that may have some Predictive value and Relevance to the IT security professionals looking after system and network security.  Like the previous one, it is Actionable in the sense that low compliance levels and/or coverage obviously implies the need to increase compliance and/or coverage, but only some of the necessary details of what actually needs to be done can be gleaned from the raw data.

By now, we're forming the distinct impression that the metrics listed in the survey are specifically technical measures that any decent vulnerability scanner produces - no surprises there given Tripwire's involvement. 

More to follow: if you missed them, see the introduction and parts one, two, three and five of this series.

06 June 2013

Interesting metrics in the Novopay report

Here in New Zealand, a major government project to introduce Novopay, a new payroll system for schools, has made the news headlines repeatedly over many months, culminating in a media storm, the resignation of the accountable minister and independent investigations into the fiasco.  

In my experience, the issues reported are far from unique to Novopay.  I recognize the characteristics of many large software implementation or business change projects I have audited in the past, such as poor project governance, technical complexity arising from complex and probably inadequately specified user requirements, extensive customization of the core IT system, the introduction of new business and IT processes, and rapidly deteriorating commercial relations between the customer and supplier as the implementation went badly, to say the least.  I'll leave the question of why these kinds of issue crop up over and over again for another rainy day.  For now, suffice to say that the project has so far narrowly avoided becoming a train wreck.

On the up-side, the government has opened up about the issues, since the minister resigned at least, publishing quite detailed reports for instance and stating a number of recommendations to sort the mess out.  The independent investigations took just three months, no mean feat in such a complicated and politically-charged situation.  The new minister in charge, Steven Joyce, acknowledged the severity of the situation on prime-time TV just the other evening, and responded positively to robust questioning by a well-briefed journalist.  An impressive, confident performance.

Anyway, I digress.  The technical report is a good example of how metrics can be used to substantiate as well as illustrate high-level management reports.  

Plucking out a single example, page 31 of Deloitte's Novopay Technical Review report says "As at 07 March 2013, 500 defects have been open for an average of 116 days (and this is relatively consistent across different severity levels)."  This sentence includes the number of defects and the average number of days that defects take to close, two key metrics.  The final parenthetical clause refers to additional data concerning the breakdown of defects by severity: one might expect that relatively serious defects would be fixed more quickly than trivia, but in practice serious defects tend to be inherently difficult to fix, especially as it appears that the team has learnt its lesson and is now paying more attention to testing fixes before they are implemented.
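Those two metrics, plus the severity breakdown, fall out of a simple analysis of the defect register.  A sketch with invented defect records:

```python
from collections import defaultdict

# Invented (severity, days open) records standing in for a defect register
open_defects = [
    ("critical", 150), ("critical", 120),
    ("major", 110), ("major", 90),
    ("minor", 60),
]

ages = [days for _, days in open_defects]
print(len(open_defects), "defects open, average",
      sum(ages) / len(ages), "days")

by_severity = defaultdict(list)
for severity, days in open_defects:
    by_severity[severity].append(days)
for severity, day_list in by_severity.items():
    print(severity, sum(day_list) / len(day_list), "days average")
```

Note how the severity breakdown is what turns a headline number into something diagnostic, just as in the Deloitte report.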

Recommendation 2 of the Novopay Technical Review report concerns metrics. "Establish robust performance management processes and checkpoints to monitor remediation progress.  To support achievement of the objectives, clear monitoring and performance measures need to be established and managed through a strong performance management process.  This should include milestones for key activities and definition of stability from an operational metrics perspective (e.g. the number of no-pays, percentage of errors)." [page 41].  FWIW I recommend that the education ministry and Talent2, the contractors concerned, are PRAGMATIC about it.  I'm waiting by the phone ...

PS  The separate ministerial inquiry report also makes fascinating reading.  There have clearly been serious issues in several areas, so it is no surprise that a number of senior people burnt out (I wonder if "burn out rate" qualifies as a useful metric on these huge politically-charged projects!).

One little paragraph that really caught my eye said "Work commenced on the service requirements in October 2008.  This process was lengthy, and was never actually completed.  Even after Go Live, new requirements were being discovered."  That's a classic example of "Ready, Fire, Aim" if ever I heard one. 

SMotW #60: proportion of security policies up to standard

Security Metric of the Week #60: proportion of security policies that meet documentation quality standards 

There are two clear presumptions behind this metric: (1) there are 'documentation quality standards' which apply to the policies, and (2) security policies are being or will be assessed systematically against those standards.  Both are signs of a relatively mature, well-structured and probably quite bureaucratic organization.  Measuring the quality of security policies according to how many of them meet the corporate standards probably seems like a natural thing to do ... but is it really?  Or is it just needless red tape?

Most formal documentation quality standards specify requirements for formatting, layout and structure, since these are relatively straightforward to define, to implement and to audit against.  They are basically style guides.  The better ones go on to cover far more subjective and, frankly, more important matters such as readability.  The very best talk about quality, purpose and utility from the perspective of the readers and other stakeholders, implying that they are best placed to determine the value of the policies.  In that context, whether or not a security policy uses 12 point Tahoma is of little significance if the content is so abstruse and jargon-riddled, or covers such a non-issue, that nobody is going to pay it any attention anyway.

This, then, is an example of a bureaucratic metric, potentially of some operational value to the quality assurance function tasked with evaluating and reporting the quality of security policies but of little relevance beyond their domain.  The PRAGMATIC ratings reflect our poor opinion of this metric as a measure to improve the governance, management and maturity of information security:


A better alternative might be to get the intended audiences and interested managers to assess and measure the quality of the security policies, and in so doing identify the characteristics of good vs bad policies.  Slipping those characteristics into the corporate style guide, along with good and bad examples to push the point home, is more likely to achieve genuine progress, in our opinion, than measuring compliance with the documentation quality standards.