We didn't always. Originally, we used Thomson Reuters data, prior to July 1, 2012. At the time, our Research Director Marc Gerstein wrote the following piece to explain our position on different data providers.
Why Portfolio123 is Switching From Thomson Reuters Fundamental Data to Compustat
A Database-Comparison White Paper By Marc Gerstein, Research Director, Portfolio123, 6/27/12
Effective 7/1/12, company fundamental data will be provided to us by Standard & Poor’s, Compustat and CapitalIQ (S&P). Up till now, we have been licensing this data from Thomson Reuters.
Compustat has been in business for more than 50 years and is the undisputed leader in providing fundamental data for use in the sort of modeling (screening and ranking with backtesting) we do, and their ‘Snapshot’ database is used by most academicians. We have carefully evaluated their offering and are confident that the change will be highly beneficial to users.
Here is a summary of the key differences you will notice:
- Compustat-Snapshot is the only “true point in time” database. Every line item has an “effective date”, meaning our testing will benefit from much greater accuracy in re-creating history.
- Compustat-Snapshot is free of survivorship bias, both in terms of companies and securities.
- We will be testing over longer periods. As of July 1, “Max” testing will commence on 1/2/99 rather than 3/1/01.
- We will be computing ratios (growth rates, debt ratios, etc.) in house, rather than relying on pre-packaged ratios provided to us by a data provider. This allows us greater control and flexibility to create any new historical ratio.
- Our sector and industry classifications will use Standard & Poor’s widely-respected GICS schema, including the full history of changes (for example, IBM switching from hardware to service).
- Compustat, with its long heritage of and experience in serving users who do fundamental modeling, offers standardized figures that are much more suitable for our needs.
- There will be changes in the number of companies for which we have fundamental data. Our new default universe, entitled “All - Fundamentals”, contains about 6,400 stocks as of this writing. If you use technical analysis only, you can switch to the “All Stocks” universe, consisting of approximately 9,400 stocks.
- The dropdown menu that had allowed you to choose how to handle NA is no longer needed based on Compustat’s presentation. During pre-announcements, when partial results are released, we implemented an automatic fall-back system for ratios that evaluate to NA.
We cannot predict whether specific models will produce better backtest results with the new data than you had been seeing. Sometimes they will. Sometimes they won’t. In the latter instances, take this as an opportunity to improve the models (based on more appropriate data inputs and testing protocols) in order to aim for better real-world performance.
We understand that in the short term, status quo is the most comfortable course of action. But the opportunity to work with S&P and Compustat confers considerable long-term advantages on Portfolio123 users, as you will see as you become more familiar with the data.
As you are by now aware, Portfolio123 will be switching to Compustat fundamental data effective 7/1/12. While Thomson Reuters and Compustat are both institutional-quality databases (i.e., they appeal to sophisticated discriminating investors who are deeply concerned about data quality and are able and willing to pay up to get what they want), the offerings are very different. You will frequently see different numbers representing what you might expect to be the same item, and you can expect to see changes in the stocks that are included and excluded from your models. While status quo is always the most comfortable short-term option, we strongly believe that going forward, our subscribers will be best served by the switch to Compustat data.
A database is a database is a database – or not!
Many believe financial data, available via public filings (with the S.E.C.), is clear and objective, that for every item, there is one correct number, and that the relative quality of databases depends on the accuracy of transcription (i.e., if the company filing says a particular item is $943.2 million, a good-quality database will show $943.2 while a poor-quality database will show a different number).
The good news is that transcription is easy (easier than ever thanks to the SEC’s having pushed companies to use the automated XBRL protocol). While imperfections may still crop up from time to time, we can probably count, more now than ever, on all databases correctly showing $943.2.
The bad news is that most of what we use in our models does not consist of directly-recorded numbers. Instead, we typically work with “standardized” databases created by data vendors based on the reported numbers. We’ll see as we go on that the design and management of standardized databases involves a lot less science and a lot more art than many realize. Accordingly, there’s considerable room for differences in numbers many might assume ought to be the same across all databases.
So as you work with Compustat, expect to encounter differences. While it may not be feasible for you to line Compustat up against Thomson Reuters to make direct comparisons (something we at Portfolio123 have been doing extensively), you can expect some stocks that didn’t previously make it into your models to now appear, and vice versa. (This applies only to models that use fundamentals; the price and volume data used for technical analysis is not impacted by Compustat fundamentals.)
Differences, per se, are no cause for alarm. They are inevitable regardless of which databases are compared. A switchover should be judged not on the presence or absence of differences but on whether the new database is, from our point of view, better, that is, more suitable for our model-driven investment efforts. The purpose of this document is to explain why we believe our answer for Compustat is a resounding “Yes.”
In fact, we’ve been concerned about key aspects of Thomson Reuters data for a long time, so much so that even if we were to stay with that firm, you’d still encounter substantial differences in the future in light of a multi-year project we’d undertaken for the purpose of replacing the ratios we’d been getting from Thomson Reuters, their “Ratios & Statistics” (RAS) file, with ratios that we created in house, often using different, sometimes starkly different, logic. When an opportunity to switch to Compustat arose recently, we decided, after much study, analysis, negotiation, etc., to make the move (i) because Compustat offers, out of the box so to speak, many of the things we’d been developing in-house, (ii) because Compustat’s approach to data collection facilitates additional improvements beyond what we could accomplish by applying our logic to Thomson Reuters data, and (iii) because Compustat offers other valuable items not provided by Thomson Reuters, meaning we will have opportunities to further enhance our platform going forward.
This is a lengthy document containing detailed examples. You do not need to read further if you are comfortable with what has been discussed to this point about databases differing, and if you know of or are inclined to learn on your own of Compustat’s best-in-class stature among institutional investors and academicians who develop and test fundamental stock-selection models. (Note: Compustat’s orientation toward modeling is important: Its parent company, Standard & Poor’s, offers another fundamental database, Capital IQ, that is also best-in-class but we focused on Compustat because Capital IQ, as impressive as it is – and it is, indeed, impressive – is not as sharply focused on the sort of modeling we do. Capital IQ serves the world at large. Compustat focuses on users like us.) If you’d like a bit more background but not necessarily all the details, you can read through page seven and stop there, before we get into more detail on the standardization of accounting information.
Many of our users, however, are deeply aware of and interested in financial data (such as those who contact us, sometimes occasionally and sometimes more often, to say “You’re showing X for this number while so-and-so shows Y. What’s the reason for the difference? Who is right?”) and would likely want a more detailed understanding of the differences between the two databases. These users will, hopefully, derive considerable benefit from reading the entire document.
Introducing the different approaches
The Thomson Reuters and Compustat databases both start with the same raw materials (the financial statements) and both adjust them in such a way as to produce “standardized” statements.
Standardization is critical for us. It’s what allows us to be able to work with screens and ranking systems. Attempting to screen or rank based on data taken directly from financial statements would be like trying to run a car based on crude oil taken directly from the ground. It can’t be done. Refinement is necessary. That’s also true with use of data. Absent standardization, there can be no stock screening, no ranking, no Portfolio123, no other kinds of fundamental modeling.
Consider a simple example.
- Assume Company X, in its 10-K, reports line items labeled “Headquarters Salaries,” “Headquarters Occupancy Expense,” and “Advertising Expense.”
- Assume Company Y does things differently and reports “General Administrative Expense.”
- We’d like to create a screening rule limiting results to companies whose overhead is low as a percent of sales.
- So we might be tempted to do something like this: Overhead/Sales < .05
- The screen can’t evaluate Company X or Company Y, because neither reports “Overhead.”
- We might instead have to do something like this:
- Overhead/Sales < .05 or (Headquarters Salaries + Headquarters Occupancy Expense + Advertising Expense)/Sales < .05 or General Administrative Expense/Sales < .05
- But that, too, may fail. Suppose company C uses the label “Selling Expense.”
- As you can see, if we worked directly with financial data as reported by the companies, even the simplest screening rules would likely collapse under the weight of their own complexity.
- Databases fix this by creating standardized financial statements. Headquarters Salaries, Headquarters Occupancy Expense, Advertising Expense, General Administrative Expense, Selling Expense, and countless other similar labels are included in a standardized item known as “Selling General and Administrative Expense.”
- Now we can easily create our screening rule:
o Selling General and Administrative Expense / Sales < .05
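As a rough illustration of the idea above, the sketch below maps as-reported labels onto one standardized item and then applies the screening rule. The label set, the function names, and all figures are hypothetical; this is not how either vendor actually implements standardization.

```python
# Hypothetical mapping of as-reported labels to the standardized SG&A item.
SGA_LABELS = {
    "Headquarters Salaries",
    "Headquarters Occupancy Expense",
    "Advertising Expense",
    "General Administrative Expense",
    "Selling Expense",
}


def standardized_sga(reported):
    """Sum every as-reported line item that maps to standardized SG&A."""
    return sum(value for label, value in reported.items() if label in SGA_LABELS)


def passes_screen(reported, sales, threshold=0.05):
    """Apply the rule: Selling General and Administrative Expense / Sales < threshold."""
    return standardized_sga(reported) / sales < threshold


# Made-up figures for two companies that label their overhead differently.
company_x = {
    "Headquarters Salaries": 20.0,
    "Headquarters Occupancy Expense": 10.0,
    "Advertising Expense": 15.0,
}
company_y = {"General Administrative Expense": 80.0}

print(passes_screen(company_x, sales=1000.0))  # 45 / 1000 = 0.045
print(passes_screen(company_y, sales=1000.0))  # 80 / 1000 = 0.080
```

The screening rule itself stays a single line; all of the label-by-label judgment lives in the mapping, which is where the databases differ.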
Sometimes, it’s easy to standardize. Other times (many times, actually), it’s challenging. The databases differentiate from and compete with one another based on how they approach this task.
These differences generally are not arbitrary. More likely, they reflect the principles that motivated the initial creation of the databases. As you read on, bear in mind the different founding principles that apply to Thomson Reuters and Compustat.
What Makes Thomson Reuters Tick
Thomson Reuters uses the fundamental database it received when it acquired Multex in 2003. Multex obtained that database when it acquired Market Guide in 2000. The Market Guide database, built up during the 1990s, aimed to support a general audience, but particularly the newly-emerging financial mass media (including the internet) and, accordingly, took pride in covering as many companies as possible and in its stature as a detailed “as reported” database. It aimed to give users (audiences) the complete, detailed set of line items exactly as reported by the companies, sparing them the burden of hunting down financial statements to see the real numbers. This was, after all, the new economy and the new economy was very much about convenience (skip the 10-K, surf to the web) and inclusiveness (there’s no such thing as a company not worthy of being considered: If the shares barely ever trade, no problem: someday they might trade more regularly. If there are no revenues, no problem: someday there may be some).
We’ll come back to inclusiveness and universe size later on. For now, let’s focus on the impact of as-reported and standardization.
If an oil company reported a line item for “Drilling Expense,” Market Guide proudly showed prospective licensees that its presentation would show the “Drilling Expense” item. If an airline broke out “Fuel Costs” as a separate line item, Market Guide did likewise. And obviously, Market Guide would show whichever among the labels in our above example (Headquarters Salary Expense, etc.) a company chose to use.
As we saw though, as-reported data is limited when it comes to modeling based on numbers. As a result, Market Guide created a second “standardized” presentation in order to support such applications as stock screening (that’s the data source we’ve been using up till now).
Hence the Market Guide (and Multex and Thomson Reuters) data collection effort could not truly be considered complete simply by the copying and correctly labeling of Drilling Expense, etc. Each item also had to be matched with a code that indicates how it would be classified in the standardized presentation. Presumably, Drilling Expense and Fuel Costs would be coded SCOR by Market Guide and later Multex and Thomson Reuters so that they would be reflected in a standardized line item labeled “Cost of Revenue, Total.”
How about an item labeled “Headquarters Salaries”?
- Market Guide (and now Thomson Reuters) has a line item called “Selling, General/Administration Expenses, Total” (code: SSGA) and this item would definitely need to include Headquarters Salaries.
- But the SSGA item sits atop three sub-items: Selling, General/Administration Expenses (code: ESGA), Labor & Related Expense (code: ELAR), and Advertising Expense (code: EADV).
- Now, we get to the human element of data collection.
- Some collectors will put Headquarters Salaries in ELAR and repeat it in the overall SSGA line.
- Others might put it in ESGA and repeat it in SSGA.
- Others might put it directly in SSGA.
- And others might put it in ESGA or ELAR only and not bother to repeat it in SSGA. If the latter, hopefully an “editor” will catch it and send it back for completion, but that doesn’t always happen.
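To make the possibilities above concrete, here is a minimal sketch of one way a total could be reconciled against its sub-items. The codes SSGA, ESGA, ELAR and EADV come from the text; the function, its fallback logic, and the tolerance are assumptions made for illustration and are not the actual Thomson Reuters edits or the actual Portfolio123 recalculation logic.

```python
def reconcile_sga(ssga, esga=0.0, elar=0.0, eadv=0.0, tol=0.01):
    """Return (usable_total, note) for one company-period.

    ssga is the reported total (None if the collector left it blank);
    esga, elar and eadv are the three sub-items.
    """
    parts = esga + elar + eadv
    if ssga is None or ssga == 0:
        if parts:
            # Collector filled a sub-item but never repeated it in the total.
            return parts, "total missing; sub-items used"
        return None, "no SG&A data"
    if parts and abs(ssga - parts) > tol:
        # Both are present but do not agree: send back for review.
        return ssga, "total and sub-items disagree"
    return ssga, "ok"


# Made-up inputs covering the collector behaviors listed above.
print(reconcile_sga(100.0, elar=100.0))  # sub-item repeated in total
print(reconcile_sga(100.0))              # entered in total only
print(reconcile_sga(None, esga=100.0))   # sub-item only, total left blank
print(reconcile_sga(100.0, esga=60.0))   # inconsistent entry
```

Even this toy version has four branches for a single line item, which hints at why checking every ratio input this way takes time.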
Does that seem confusing? Now you know why our ratio-recalculation project has been a multi-year effort. We have a lot of possibilities to consider! More importantly, recognize that such tasks were not at the core of Market Guide’s early branding. Its point of pride was elsewhere: in the number of companies under coverage and in its as-precise-as-possible reproduction of the company filings.
What Makes Compustat Tick
Let’s switch gears now and consider Compustat, a much older database (going back to the early 1960s).
When Compustat was born, there was no internet, nor was there any desire on the part of any other segment of mass media (which, typically, couldn’t have cared less about investing) to offer conveniently accessible reproductions of what companies reported. If one wanted to see the financial statements, one was expected to contact the company and ask to have reports sent to him or her. There was no role for financial databases, except for the very small group of people (geeks) within the investment community who wanted to use the era’s cool new gadgets (mainframe computers) to store and evaluate data in new ways that might provide more effective techniques for stock selection.
So unlike Market Guide, Compustat wasn’t seeking to promote precise electronic reproductions of company filings (those who wanted to see the actual reports were expected to get them from the company or a library).
Compustat, seeking to support the newly-emerging group of investors who pioneered the building and testing of computerized models based on fundamental principles, saw standardization not as a necessary or burdensome add-on but as the sine qua non of its offering.
Also, although there were some go-go market periods during Compustat’s early days, not to mention promoters, touts and so forth (which are present in every era), the new economy wouldn’t yet be invented for a generation. Back in Compustat’s early days, there was no widespread burning desire on the part of investors sophisticated enough to do fundamental modeling to consider every so-called “company” that was able to figure out how to find the place where one had to file the certificate of incorporation or do an IPO. While Compustat’s customers were open to extremely small firms, they had some standards: They wanted businesses that actually existed and stocks that actually traded more than once in a blue moon. Hence Compustat was not necessarily marketed on the basis of how many thousands of stocks it covered.
The Competitive Battle Lines Are Drawn
So there we have it, two databases, two heritages, two brand identities, and two overarching characteristics: Thomson Reuters and the principles of largest possible universe and precise reproduction of what the companies presented, versus Compustat, a “value added” database designed to support computerized investment modeling.
If I were launching a general financial web site, say to compete with Yahoo! Finance, I’d probably favor a data vendor with an as-reported heritage: Thomson Reuters, Capital IQ, EDGAR Online, etc. (and indeed, Yahoo! has, over the years, gone back and forth among these providers, apparently based on whichever provider offered more favorable licensing terms). Compustat, on the other hand, serves a niche market. But, and this is a big “But,” we at Portfolio123 are at the heart of the niche Compustat targets.
So now, let’s look at some important (from our vantage point) differences and see why Compustat more effectively serves our needs.
For a long time, Portfolio123 users have been accustomed to working with a universe consisting of about 8,000-9,000 stocks. That will change. Going forward, we’ll still have a lot of stocks, even more than we do now. But many of these will be usable only by those who confine themselves to models based on technical analysis. Our new “default” universe comprising companies for which we’ll have fundamental data (labeled “All - Fundamentals” on the universe-selection menu) will shrink to a little more than 6,000 stocks at present. Compustat does not initiate coverage of firms with market capitalizations below $25 million or of Tier 1 ADRs. (But once a stock enters the universe, it will stay there even if subsequent market conditions, or other problems, later push it below the threshold. So you’ll still encounter many extremely small, too small, companies. Don’t delete the liquidity rules you’ve been using!)
A Market Guide/Multex/Thomson Reuters sales representative would jump on that smaller universe size as a major negative (as they had been doing for years). But is that really so for our needs?
As to the ADRs, the Tier 1 type trade in the less liquid over-the-counter markets and they do not file financial statements that are consistent with U.S. accounting rules, something mainstream ADRs regularly do.
As to the size issue, look at your models. Do you really draw from the entire 8,000-9,000 stock universe? More likely, you have some basic filters designed to weed out the least viable stocks at the outset. (This is likely to be so even for models based solely on technical analysis.) I edit the Forbes Low-Priced Stock Report, aimed at stocks trading below $3, so my liquidity filters are probably much less stringent than those imposed by many Portfolio123 users and I do not see any handicap resulting from the Compustat inclusion rules. (For the most part, the stocks I monitor that have market caps below $25 million fell to that level after having been higher in the past and hence remain in the coverage universe. And as suggested above, I’m not going to delete my liquidity filters.)
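The inclusion rule described above (coverage is not initiated below $25 million, but a stock stays in once it has entered) behaves like a one-way switch. The sketch below is only an illustration of that behavior with made-up market caps; the function name and the period-by-period framing are assumptions, not Compustat’s actual procedure.

```python
def coverage_flags(market_caps, threshold=25.0):
    """Return one membership flag per period for a single stock.

    market_caps is a chronological list of market capitalizations in
    millions of dollars. Coverage starts the first time the cap reaches
    the threshold and is never withdrawn afterward.
    """
    flags = []
    covered = False
    for cap in market_caps:
        if cap >= threshold:
            covered = True
        flags.append(covered)
    return flags


# A made-up stock that crosses $25 million once and then falls back.
print(coverage_flags([10.0, 30.0, 12.0, 8.0]))
# A made-up stock that never reaches the threshold.
print(coverage_flags([5.0, 9.0, 20.0]))
```

The first stock remains covered at $12 million and $8 million, which is why liquidity rules are still needed after the switch.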
Portfolio123 users can usually produce much more impressive performance tests when working with the entire 8,000-9,000-or-so stock universe. But we assume the main goal of our users is to be able to make real-world money applying those models. That can’t happen if the test results are influenced by the presence of stocks that cannot be traded in the real world on terms anywhere near those presumed by the pricing database. Bear in mind that a database will produce a price for a stock every day, however infrequently it actually trades. With illiquid stocks, these imputed prices (the ones upon which our tests are based) bear little if any resemblance to the prices you’d actually have to pay to buy the stock or the price you’d get if you tried to sell.
The $25 million market cap filter isn’t perfect. Actually, no filter can ever be perfect. (In connection with the Forbes Low-Priced Stock Report, I’ve often had to manually strike companies that passed my filters in terms of the letter of the law but failed the spirit of the law.) But we believe it’s an imperfection we can tolerate in light of how marginal a burden it actually is, in light of how compatible it’s likely to be with the sub-universes already delineated by liquidity filters created by most users, and in light of the other benefits Compustat provides, to be discussed below.
In fact, as you read on and learn more about the differences between Thomson Reuters and Compustat and see some of the things that concern us about the former, it may seem fair to wonder if Thomson Reuters would have the wherewithal to do things differently if they weren’t burdened by the need to allocate resources to the ongoing handling of data in connection with several thousand essentially untradeable stocks.
Differences in Standardization
There are many differences in the ways Compustat and Thomson Reuters approach standardization, too many to enumerate in full (actually, there’ll be differences between any two databases, even between Compustat and its corporate cousin Capital IQ). But the major differences can be grouped into several themes which will be illustrated below, starting with the simplest.
Theme 1: Basic Data Collection
We’ll start with a case study that examines a very simple but important metric: cash flow per share, which is net income plus depreciation and amortization, divided by the number of shares. Net income is reported in two places; on the income statement and on the statement of changes in cash position (the cash flow statement). Depreciation and amortization are occasionally reported in the income statement but more often than not, it’s best to get these figures from the cash flow statement, where they are more consistently reported.
Compustat and Thomson Reuters agree regarding the foregoing. Both compute cash flow by adding the figures taken from the cash flow statement. Accordingly, one would expect both data providers to agree on this number. That is usually the case. But there are times when it doesn’t work that way.
Consider 2011 cash flow per share for Coca Cola Enterprises (COKE). Compustat reports the figure as $10.45. Thomson Reuters reports it as having been $8.30. Yet both were looking at the same financial statement and applying the same logic.
The differences occurred in the computation of cash flow, before the per-share division. Here’s what happened.
The Cash Flow Statement reported the following relevant items:
- Net Income: 32,021
- Depreciation: 61,686
- Amortization of Intangibles: 432
- Amortization of Debt Costs: 2,330
With Compustat, total cash flow is computed by adding 32,021 in net income to 64,448 in depreciation and amortization; the latter is computed as 61,686 + 432 + 2,330.
With Thomson Reuters, total cash flow is computed by adding 32,021 in net income to 62,118 in depreciation and amortization; the latter figure having been computed as 61,686 + 432.
Thomson Reuters ignored the 2,330 that was labeled Amortization of Debt Costs. That happened because of the label Coca Cola Enterprises chose for its cash flow statement. Had the company simply presented a 2,762 figure labeled Amortization (432 + 2,330), Compustat and Thomson Reuters would match. But the company thought it was doing investors a service by providing more detail on Amortization. Unfortunately, however, the Thomson Reuters data collection staff, trained to recognize “Amortization” and/or “Amortization of Intangibles,” “Amortization of Goodwill,” etc. did not grasp the fact that “Amortization of Debt Costs,” a non-standard label, referred to an item that should have been included in the overall Amortization and Cash Flow calculations.
Many more examples like this can be found.
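The arithmetic behind the two totals can be reproduced in a few lines. The figures are the ones quoted above; the “Depreciation” label for the 61,686 item and the two sets of recognized labels are assumptions used only to illustrate how a single unrecognized label changes the result.

```python
NET_INCOME = 32_021

# Cash flow statement items as quoted in the text.
# The "Depreciation" label for 61,686 is inferred, not quoted.
ITEMS = {
    "Depreciation": 61_686,
    "Amortization of Intangibles": 432,
    "Amortization of Debt Costs": 2_330,
}

# Hypothetical label sets standing in for each collector's judgment.
RECOGNIZED_BROAD = set(ITEMS)
RECOGNIZED_NARROW = {"Depreciation", "Amortization of Intangibles"}


def total_cash_flow(net_income, items, recognized):
    """Net income plus every D&A item whose label the collector recognizes."""
    d_and_a = sum(value for label, value in items.items() if label in recognized)
    return net_income + d_and_a


broad = total_cash_flow(NET_INCOME, ITEMS, RECOGNIZED_BROAD)
narrow = total_cash_flow(NET_INCOME, ITEMS, RECOGNIZED_NARROW)

print(broad)           # 32,021 + 64,448
print(narrow)          # 32,021 + 62,118
print(broad - narrow)  # the omitted Amortization of Debt Costs
```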
For another example, note that Thomson Reuters presents Joe’s Jeans (JOEZ) as a debt-free company. Compustat shows it as having short-term debt now, and as consistently having had short-term debt in the past.
This occurred because the Thomson Reuters data collection staff, familiar with such line items as “Current Portion of Long Term Debt,” “Debt Due Within One Year,” “Short-Term Debt,” “Notes Payable,” etc. did not recognize a non-standard label, “Due to factor,” as a form of borrowing. Compustat, however, did recognize the item as one that had to be included in its standardized “Debt Included in Current Liabilities” field.
Interestingly, Thomson Reuters’ persistent failure to recognize the debt item occurred notwithstanding the fact that it does present interest expense (non-operating interest expense; see below for a separate discussion of this classification) quarter after quarter, year after year.
The data collection process includes “edits” designed to flag potential errors. I don’t know exactly what edits are used by Thomson Reuters, but one can wonder if the persistent presence of interest expense together with debt being persistently reported as zero ought to have triggered an edit. I’ll admit it’s not a slam dunk. Debt, as a balance sheet item, is reported for only four days out of the year, so it is possible for a company to have debt that is outstanding for much of the year, but which was temporarily paid down to zero on March 31, June 30, September 30 and December 31. This possibility could explain the Thomson Reuters failure to have flagged the odd presentation of Joe’s Jeans. Or it is possible the oddity was flagged but allowed to pass as is given a failure to recognize the meaning of “Due to factor.”
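An edit of the kind speculated about above might look something like the following sketch. It is purely hypothetical: the function, the four-quarter threshold, and the sample figures are invented for illustration, and nothing here describes the edits either vendor actually runs. As noted, a hit would only justify a human review, since debt can legitimately be zero on each balance sheet date.

```python
def flag_interest_without_debt(quarters, min_quarters=4):
    """Return True when interest expense is positive while reported debt is
    zero in at least `min_quarters` of the supplied quarters.

    quarters is a list of (interest_expense, total_debt) pairs.
    """
    suspicious = sum(
        1 for interest, debt in quarters if interest > 0 and debt == 0
    )
    return suspicious >= min_quarters


# Made-up quarterly data, not actual company figures.
persistent = [(0.2, 0.0), (0.3, 0.0), (0.2, 0.0), (0.3, 0.0)]
one_off = [(0.2, 0.0), (0.3, 5.0), (0.2, 4.0), (0.3, 6.0)]

print(flag_interest_without_debt(persistent))  # every quarter looks odd
print(flag_interest_without_debt(one_off))     # a single odd quarter
```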
What we do know is that Compustat did a more effective job of recognizing the meaning of the unfamiliar label.
Recall, now, the introductory material, where I described standardization as being not just a science but as also containing elements of art. Indeed, the art of standardization is no small matter: It’s substantial and it’s inevitable. Companies try as best they can to provide reasonable clarity on their operations. This is a good thing for those who look directly at the documents. But it does pretty much assure that there will be differences from one database to another as standardization, a process that relies heavily on human judgment, enters the picture.
If anything, the number of differences like this, although already considerable, may increase going forward as a result of the new XBRL (eXtensible Business Reporting Language) automated reporting protocol mandated by the SEC.
Some presume the switchover to automated reporting will make life easier for data-collection organizations. In terms of transcribing the as-reported number, that is true. But in terms of plugging the numbers into standardized products, the best-case scenario is no change and, more likely, XBRL may add to the challenges due to the first part of the acronym, the “X,” eXtensible. The quality of extensibility is designed to give companies the freedom to extend the language by coming up with their own classifications based on the characteristics of their business. As companies take more advantage of extensibility, the burden on data providers will grow as they encounter more unfamiliar line items that must be assessed in terms of how they should relate to the standardized formats their clients are using.
All data providers, including Compustat, will at times make judgments that can be challenged. The human element is always present so we have no illusions that any data provider will ever be perfect. We understand that even in the best of circumstances, we may have occasion to contact Compustat to question some items. But our analysis to date indicates to us that Compustat, with its decades-long tradition with and experience in determining how to plug reported line items into standardized statements that will be used by investors for fundamental modeling, and with a more focused workload (recall that it does not devote resources to input and maintain several thousand untradeable penny stocks), is much better positioned to outperform Thomson Reuters, with its heritage of as-reported data for as many companies as it can find.
Theme 2: Different ways to compute a particular ratio
Now, let’s consider a very basic example of how two databases can come up with different results for even the most seemingly simple ratios. We’ll use as an example, Gross Profit, which is typically defined as Sales minus Cost of Goods Sold.
As line items go, these tend to be easier to standardize, so Compustat and Thomson Reuters will often agree, or at least be very close, on the basic numbers. But when it comes to the calculation of Gross Profit, they’ll disagree almost 100% of the time! The variations will typically center around Cost of Goods Sold.
Thomson Reuters takes the Cost-of-Goods-Sold number from the income statements as filed by the companies or from items that can be easily recognized as such for standardization purposes. Compustat does likewise, but then adds an important adjustment.
Is that proper? Can one adjust numbers that are clearly identified in financial statements?
It depends on why one is working with the statements. We (users of Portfolio123 and StockScreen123) work with financial statements for purposes of investment analysis (electronic bulk analysis, but analysis nonetheless) and this is often at odds with Generally Accepted Accounting Principles. According to Graham & Dodd:
A major activity of security analysis is the analysis of financial statements. This analysis includes two steps: First, the financial statements must be adjusted to reflect an analyst’s viewpoint, that is, the analyst changes the published numbers, eliminates some assets and liabilities, creates new ones, alters the allocation of expenses to time periods, and, in effect, creates a new set of financial statements. Second, the analyst processes the new information by the calculation of averages, ratios, trends, equations, and other statistical treatment.
Cottle, Murray, Block, Graham & Dodd’s Security Analysis, page 133 (emphasis supplied).
Some of the adjustments proposed in Graham & Dodd are quite dramatic and best left to human discretion applied to analysis of a single company. But there are worthwhile things that can be done at the database level and we, who in essence analyze in bulk (in connection with the models we build, test, and use), should appreciate the fact that Compustat does make some of these adjustments.
Let’s look at how this impacts Cost of Goods Sold. As noted, Thomson Reuters, aiming at a general audience, supplies a number designed to reflect what is reported by the company. Compustat, aiming at equity investors, (1) determines whether the reported number includes depreciation, and if the answer is “yes,” as is usually the case nowadays, (2) subtracts depreciation from reported cost of goods sold.
Because of this adjustment (which, by the way, is the same approach as the one I was taught when I was initially trained at Value Line), Compustat will show a lower cost-of-goods-sold expense item and, consequently, higher gross profit and gross margin.
Our goal, as users, is to wind up with gross profit numbers that reflect as closely as possible the most basic level of profit earned from the sale of a particular item. Depreciation is a more generalized expense (a non-cash accrual rather than a cash outlay) that is more broadly tied to the enterprise as a whole. That doesn’t diminish its importance, and we will make prominent use of it when we work with “operating profit” and “operating margin” (we’ll actually be using numbers Compustat refers to as “operating profit after depreciation”). But it really isn’t the sort of product-based expenditure that comports with what we should see in Cost of Goods Sold. Hence for us, the Compustat adjustment is preferable.
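The effect of the depreciation adjustment on gross margin can be seen with a small worked example. All figures below are made up, and the function is only a sketch of the two approaches as described in the text, not either vendor’s actual formula.

```python
def gross_margin(sales, reported_cogs, depreciation_in_cogs=0.0):
    """Gross margin after removing any depreciation embedded in reported COGS.

    With depreciation_in_cogs left at zero this is the as-reported approach;
    passing the embedded depreciation gives the adjusted approach.
    """
    adjusted_cogs = reported_cogs - depreciation_in_cogs
    return (sales - adjusted_cogs) / sales


# Hypothetical company: sales 1,000; reported COGS 600, of which 50 is depreciation.
as_reported = gross_margin(1000.0, 600.0)
adjusted = gross_margin(1000.0, 600.0, depreciation_in_cogs=50.0)

print(round(as_reported, 4))  # (1000 - 600) / 1000
print(round(adjusted, 4))     # (1000 - 550) / 1000
```

The same income statement yields a 40% margin under one convention and 45% under the other, so a screen with a gross-margin threshold between those values would treat the company differently depending on the database.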
This is another instance where you cannot assume, when you see differences along these lines, that somebody is doing something wrong. The databases are deliberately choosing different approaches in order to serve different target customers. Different formulas for a particular item are commonplace.
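To make the effect of the depreciation adjustment concrete, here is a minimal sketch in Python. The figures are hypothetical (they do not come from any filing), and the function and variable names are my own; the sketch only illustrates the direction of the change, not either vendor's actual procedure.

```python
# Hypothetical figures for illustration only; not taken from any filing.

def gross_margin(revenue, cogs):
    """Gross profit expressed as a fraction of revenue."""
    return (revenue - cogs) / revenue

revenue = 1000.0
reported_cogs = 700.0         # as reported by the company, depreciation included
depreciation_in_cogs = 50.0   # amount assumed to be identified in the footnotes

# "As reported" style: use cost of goods sold exactly as the company shows it.
as_reported_margin = gross_margin(revenue, reported_cogs)

# Adjusted style: remove depreciation from cost of goods sold first.
adjusted_cogs = reported_cogs - depreciation_in_cogs
adjusted_margin = gross_margin(revenue, adjusted_cogs)

print(f"As-reported gross margin:           {as_reported_margin:.1%}")
print(f"Depreciation-adjusted gross margin: {adjusted_margin:.1%}")
```

With these assumed inputs the adjusted margin is five percentage points higher, which is the kind of gap you should expect to see between the two databases for a company that folds depreciation into cost of goods sold.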
Theme 3: Redefining Items Set Forth in Financial Statements
This theme will combine elements of the first two. It will involve determination of how reported items ought to be classified for purposes of standardization (Theme 1) and over-‐riding of perfectly-‐legal company choices (Theme 2).
Consider IBM’s interest expense for 2011. The income statement in the 10-‐K clearly shows this to be $411.0 million, and that is the figure we get from Thomson Reuters. Compustat, however, shows $420.0 million, a figure it calculated by adding back $9 million in capitalized interest which it picked up from the footnotes.
Capitalized interest refers generally to interest on debt incurred to finance the building or preparation of an asset (getting it ready to be capable of producing revenue). Companies are permitted to keep such interest payments off the income statement, add them to the asset’s cost, and depreciate them over a number of years.
Thomson Reuters noticed the $9 million capitalized-interest figure and recorded it in its standardized presentation so users could see it, but it opted to accept the company’s accounting definition of interest expense, which meant leaving the capitalized amount out of the income statement. Hence interest expense shown by Thomson Reuters matches the $411 million figure shown in the 10-K.
Which way is correct? The issue is debatable. Here is Compustat’s explanation, as set forth in the Data Guide they provide for licensees:
Capitalized interest is added back to interest expense in order to more accurately calculate coverage and profitability measures. Since companies can use significant discretion in determining how much interest to capitalize, adding back this amount allows for consistent comparisons across companies.
This is not a silver bullet. Somebody could still disagree. Nevertheless a case could be made (one with which I agree) to the effect that Compustat’s approach more effectively serves the needs of users like us, who need consistent fundamental comparisons across a wide variety of companies. It also seems consistent with the Graham-Dodd position in favor of adjusting reported numbers where necessary to present a better picture of economic reality. (Suppose Company X shows $50 million of interest on the income statement, has another $50 million that is capitalized, and generates $65 million in EBIT. Notwithstanding that the income statement only shows $50 million in interest (ditto screens and ranking systems created using Thomson Reuters), creditors would be very nervous because, in fact, the company would be $35 million short of what it needs to service its debt! The financial-strength metrics we calculate from the data should reflect this. Compustat supports our need.)
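The Company X arithmetic, and the IBM add-back, can be sketched as follows. This is an illustrative Python calculation using only the figures quoted above; the function name and its parameters are my own and are not part of any vendor's data definitions.

```python
def interest_coverage(ebit, expensed_interest, capitalized_interest=0.0):
    """EBIT divided by total interest (expensed plus any capitalized amount)."""
    return ebit / (expensed_interest + capitalized_interest)

# Company X, in millions of dollars (figures from the example above).
ebit = 65.0
expensed_interest = 50.0      # shown on the income statement
capitalized_interest = 50.0   # kept off the income statement

coverage_as_reported = interest_coverage(ebit, expensed_interest)
coverage_adjusted = interest_coverage(ebit, expensed_interest, capitalized_interest)
shortfall = (expensed_interest + capitalized_interest) - ebit

print(f"Coverage ignoring capitalized interest: {coverage_as_reported:.2f}x")  # 1.30x
print(f"Coverage with capitalized interest:     {coverage_adjusted:.2f}x")     # 0.65x
print(f"Shortfall versus total interest:        ${shortfall:.0f} million")     # $35 million

# IBM 2011, in millions: income-statement interest plus capitalized interest.
ibm_interest_adjusted = 411.0 + 9.0
print(f"IBM interest expense with add-back:     ${ibm_interest_adjusted:.0f} million")  # $420 million
```

A coverage ratio of 1.30 looks tolerable; a ratio of 0.65 does not. That is the difference a model sees depending on which definition of interest expense it is fed.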
Theme 4: Bold Alterations to the Structure of the Financial Statements
Now that we got our feet wet with some “simple” adjustments to reported items, let’s dive into the deep end and consider two areas that are likely to produce very large and persistent differences in what you see on Thomson Reuters and Compustat. More often than not, if you encounter dramatic differences in the results of your models (and, consequently, the results of the tests and simulations you run), the cause will relate to this theme.
I’ll warn you that the presentation of this theme is long and will involve two detailed examples. But if you’ve read to this point, I strongly recommend that you stick with it, since it goes a long way toward explaining the most significant differences you will see between Thomson Reuters and Compustat.
(a) Case Study: Unusual Income-‐Statement Items
Consider the following income presentation based on the 2011 10-‐K filing for Verenium (VRNM); in this example, all figures represent thousands of U.S. dollars:
Total Revenue 61,267
Cost of Product Revenue 34,281
Research and Development 10,986
Selling, general and administrative 19,365
Restructuring charges 2,943
Loss from Operations (6,508)
Interest and other income (expense) 56
Interest expense (3,062)
Gain (loss) on debt extinguishment 15,349
Gain (loss) of change in fair value of derivatives (964)
Net income (loss) before income taxes 4,871
Question: What should the data provider show for “operating profit?”
Compustat shows a smaller operating loss, one that amounts to (3,600). Interestingly, Thomson Reuters, despite its as-‐reported heritage, also varied from this particular company presentation, but it went in a different direction: It showed a substantial 8,840 operating profit!
Before going on and looking at how the different databases reached their respective conclusions, pause and review the above numbers. What do you, as an investor, think would be a reasonable answer?
Both data providers start with the same revenue figure: 61,267. But what expenses ought to be subtracted? Consider what operating profit is supposed to represent. It’s supposed to depict the result of the company’s plain-‐vanilla day-‐to-‐day business operations; sales (or revenues) minus basic costs incurred to run the business. Some such costs will be directly tied to product production. Others will not be pegged to individual units of production but should still be ordinary and necessary to the running of a business.
Assuming you’ve considered the matter and now have a general sense of what you think operating profit ought to be, let’s review how the standardized protocols created by Compustat and Thomson Reuters resolved the matter.
Here’s the Compustat computation:
Total Revenue 61,300
Cost of Product Revenue 33,100
Selling, general and administrative 30,400
Depreciation 1,300
Loss from Operations (3,600)
The decimal rounding is based on that which is shown on Compustat’s institutional web platform. Don’t worry about that. Focus on the big picture, bearing in mind that Thomson Reuters is going to show an 8,840 profit!
Consistent with an earlier thematic example, Compustat shows depreciation. The company didn’t do that on the income statement, but Compustat views it as an ordinary operating expense that ought to be shown anyway, and went into the footnotes to get it. It’s not as if Verenium omitted depreciation and overstated their income. They didn’t. They lumped Depreciation in with Cost of Product Revenue, as do most firms. So when Compustat created a separate line item for Depreciation, it avoided double counting by subtracting Depreciation from Cost of Product Revenue. OK. By now, none of this should surprise us.
What happened to Research & Development, which Verenium showed? Compustat’s standardization calls for it to be included in Selling, General & Administrative Expense (another approach that matches the way things were done at Value Line when I worked there).
But here’s where the changes get bigger: All other expenses were deemed by Compustat to be non-operating, regardless of how they were classified by Verenium. So to compute operating profit, Compustat started with Revenue and subtracted only Cost of Goods Sold, Depreciation, Selling, General and Administrative expense as originally reported, and Research & Development.
The Compustat operating loss is significantly smaller than the one reported by the company. That’s because Compustat, true to its heritage as a value-‐added data provider that aims to present numbers it deems most likely to be meaningful to investors, eliminated the 2,943 restructuring charge (Compustat includes it in a separate non-‐operating line item it identifies as “Total Unusual Items”).
Now, let’s take a look at what Thomson Reuters did (again, don’t worry about distortions caused by differing decimal rounding practices):
Total Revenue 61,270
Cost of Product Revenue
Selling, general and administrative
Research and Development
Other Unusual Expense (Income) (15,350)
Operating Income 8,840
I’m not able to explain why the Thomson Reuters Cost of Product Revenue number varies from that presented by the company. I know it has nothing to do with Depreciation (Thomson Reuters followed the company’s practice of including it.) But in the context of the question before us, this is trivial (Compustat discloses many adjustments it makes to company-‐reported numbers; Thomson Reuters does little of this, but there seems to have been something here that did trigger such an adjustment on its part). Let’s focus on the big differences.
Notice first that Thomson Reuters follows the company practice of including the 2,940 Restructuring charge in operating profit. Interestingly, though, Thomson Reuters took the debt-‐extinguishment gain, which the company classified as non-‐operating, and moved it up into the operating area!
This seems odd, but it was not an accident, not by any means. More often than not, companies do put such items in the operating category, and true to its heritage as an “as reported” database, Market Guide (which, as we know, became Thomson Reuters) established a standardization protocol that included two broad items among its operating expenses: “Unusual Expense (Income)” and “Other Operating Expenses, Total.”
Here are the sub-‐components of each.
Unusual Expense (Income) includes the following standardized line items:
- Purchased R&D Written Off
- Restructuring charge
- Litigation
- Impairment - Assets Held for Use
- Impairment - Assets Held for Sale
- Loss (Gain) on Sale of Assets – Operating
- Other Unusual Expense (Income)
Other Operating Expenses, Total includes the following standardized line items:
- Foreign Currency Adjustment
- Unrealized Losses (Gains)
- Minimum Pension Liability Adjustment
- Other Operating Income
As we saw above, Thomson Reuters moved the debt-‐extinguishment gain into the Other Unusual Expense (Income) section of the Unusual Expense (Income) area.
But what about the 964 loss from the change in value of derivatives? Thomson Reuters left that among the non-‐operating expenses and put it in the standardized classification: Investment Income – Non Operating.
Frankly, as an investor, I’d prefer to eliminate non-recurring items from operating profit (and, hence, from EBIT and EBITDA) whenever they can be identified. Compustat does that due to its heritage of offering data for fundamental investment modeling. Thomson Reuters includes non-recurring items because its as-reported heritage calls for it to follow company reporting format as closely as feasible, whether relevant to investors or not. In fact, the as-reported heritage is so strong that Thomson Reuters in this instance went out of its way in its standardized presentation to put a non-recurring item into the operating category even though Verenium deviated from the norm and placed it into the non-operating area (in the case of the debt-extinguishment gain illustrated here, Thomson Reuters overruled one company’s pro-investor classification to stay consistent with other companies that choose not to follow this approach).
Don’t for a moment assume I’m simply second-guessing the handling of Verenium. Think about what you, as an investor who uses screening and ranking to help you identify stocks for purchase or sale, want to consider when you work with operating profit, EBIT or EBITDA. Do you really want to be saddled in this context with the items Thomson Reuters pulls in under the broad headings “Unusual Expense (Income)” and “Other Operating Expenses, Total?” Even if you are interested in these items (and for some investors, there may be good reason to be fascinated by them), aren’t you better off having them segregated into a clearly non-operating portion of the income statement so you can more thoughtfully model based on their presence or absence? Part of what I had been working on in connection with the Portfolio123 ratio re-calculation project has involved trying to cleanse operating profit of these analytically-inappropriate items (Business Income, a pre-defined custom formula I created, is one aspect of this; going forward, it will become superfluous since Compustat will be giving us better operating profit numbers out of the box).
This is a frequently-occurring and very substantial issue. Operating profit (and margin) is important in and of itself. It’s also important because it directly impacts the computation of other important metrics, such as EBIT and EBITDA as well as valuation ratios based thereon (including increasingly popular valuation models that work with relationships between Enterprise Value and EBIT or EBITDA). Because EBIT and EBITDA were inflated by unusual items, many Thomson Reuters models picked this stock up as a value play. That did not happen with Compustat.
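The Verenium case can be reconciled with a few lines of arithmetic. The sketch below is a back-of-the-envelope approximation in Python that starts from the company's own reported operating loss and uses only figures quoted above (thousands of U.S. dollars). It is not either vendor's actual procedure, and the variable names are my own.

```python
# Figures from the Verenium 2011 presentation quoted above (thousands of USD).
reported_operating_loss = -6508
restructuring_charge = 2943        # company placed this inside operations
debt_extinguishment_gain = 15349   # company placed this outside operations

# Treatment resembling Compustat: pull the unusual charge out of operations.
operating_excluding_unusual = reported_operating_loss + restructuring_charge

# Treatment resembling Thomson Reuters: leave the restructuring charge in,
# and move the debt-extinguishment gain up into operations.
operating_including_unusual = reported_operating_loss + debt_extinguishment_gain

print(operating_excluding_unusual)   # -3565, close to Compustat's (3,600)
print(operating_including_unusual)   # 8841, close to Thomson Reuters' 8,840
print(operating_including_unusual - operating_excluding_unusual)  # 12406
```

A swing of roughly 12,400 on revenue of about 61,300 is what separates an operating loss from an apparently healthy operating profit, and every ratio built on EBIT or EBITDA inherits it.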
(b) Case Study: Interest Expense/Income
You’d think interest expense and income would be easy to classify. These clearly are non-operating items. Indeed, were it otherwise, there would be no point in any investor ever bothering to think about something like EBIT or EBITDA.
Not so fast! Let’s look at another real-‐life example. Here is an income presentation based on the 2011 10-‐K filing for Nucor (NUE):
Net Sales 20,023,564
Cost of products sold
Marketing, administrative and other expenses
Equity in losses of unconsolidated affiliates
Interest expense, net 166,094
Earnings (Loss) before income taxes 1,251,812
What is this company’s operating profit? No, this is not a trick question merely because the company chose not to specifically total up and identify an operating-‐profit number. Many firms do present an Operating Profit sub-‐total. Many others omit it. Companies are free to approach the matter as they and their accountants choose.
But we still need an answer.
Here’s the Compustat approach (as usual, don’t get hung up on decimal conventions):
Total Revenue 20,023.6
Cost of Goods Sold 17,513.6
Selling, General & Administrative 503.7
Interest and Related Income
Other Non-Operating Inc. (Exp.)
Total Unusual Items 15.1
Pretax Income 1,251.8
Before discussing it, let’s look right away at the Thomson Reuters approach:
Total Revenue 20,023.56
Cost of Revenue
Selling, General & Administrative
Interest Expense - Operating
Interest Income – Operating
Investment Income – Operating 10.04
Operating Profit 1,251.81
Interest Expense, net Non-operating --
Interest Income, net Non-operating --
Other, Net --
Pretax Income 1,251.81
Clearly, there’s more going on here than differing approaches to interest. We know Compustat subtracts depreciation from cost of goods sold and lists it separately. And there are differences in the way the data providers classify and net out interest expense, interest income and investment income. Some are easily apparent. Others aren’t and would require digging into footnotes. Both organizations did some work along these lines. But let’s focus, here, on the big metric: operating profit, a figure both providers need to create on their own given the company’s omission of this subtotal.
Compustat takes the numbers it sees in the 10-‐K income statement (with some footnote-‐based adjustments) and plugs them into the standard operating income categories it created (sales or revenue minus cost of goods sold minus selling general and administrative expenses minus depreciation).
Numbers that cannot be properly plugged into Compustat’s operating items are dropped down to the non-operating section of its standardized income statement. By reclassifying in this manner, it was able to derive an operating profit of 1,412.8. (Moving other items to the non-operating area enabled Compustat to reach the same pretax income figure, 1,251.8, as that reported by the company.)
Thomson Reuters likewise varies from the company presentation in order to put the data into a standardized format. But because of its “as reported” heritage, Thomson Reuters is not willing to impose its own operating-profit subtotal. Now here’s where it gets odd. Because this particular company did not clearly and specifically identify any expenses as having been non-operating, Thomson Reuters presumes all expenses are operating. (A business can exist without non-operating costs, but it cannot logically exist without some sort of operating costs. Hence the Thomson Reuters decision to presume the unlabeled expenses are operating.)
So where does that leave interest expense? That’s usually thought of as a non-‐operating expense, but as we see, Thomson Reuters won’t go there. Since Nucor didn’t clearly identify interest expense as having been non-‐operating, Thomson Reuters had to consider the interest expense to have been an operating expenditure.
But that’s not necessarily the end of the inquiry. Most companies do clearly place interest in the non-operating area (as occurred in the first example we examined). So how can Thomson Reuters cope with a single item, interest expense, being non-operating most of the time, but on some occasions, operating?
The answer: Don’t show interest expense at all, not ever – not at any time and not for any company! Instead, Thomson Reuters created two different standardized fields: Operating Interest Expense, and Non-‐Operating Interest Expense. All interest-‐expense items must be placed into one category or the other. Since this particular company did not specifically place its interest expense into a non-‐operating section of the income statement, Thomson Reuters classifies it as having been operating interest expense. (Similar protocols are followed with interest and investment income).
Because Nucor did not identify any expenses as having been non-operating, Thomson Reuters treats all of them (including interest) as operating. Revenue minus all these expenses equals 1,251.81, and that’s what Thomson Reuters shows as operating profit. Since there are no non-operating expenses, pretax income (operating profit minus non-operating expenses, which in this case sum to zero) also is 1,251.81. That means EBIT (Earnings Before Interest and Taxes) would not be 1,430.62 (1,251.81 + interest expense of 178.81), but 1,251.81. In other words, Thomson Reuters is, in effect, subtracting interest expense in order to compute Earnings Before Interest Expense! And, of course, the important interest coverage ratio would be reported as “NM,” which stands for No Meaningful Figure. These outcomes are clearly incorrect. To make the correct calculations, users are forced to override Thomson Reuters’ standardized income presentation and combine operating and non-operating interest items (something now being done by some users, though not all, including even product groups within Thomson Reuters itself that work apart from the data group!). Compustat, on the other hand, had it correct in all places from day one, as have all of its users.
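Here is a minimal Python sketch of the Nucor problem and of the workaround described above. It uses only the figures cited in the text (in millions). The variable names and the convention of returning None for "NM" are my own assumptions, not actual field names or behavior from either database.

```python
from typing import Optional

def interest_coverage(ebit: float, interest: float) -> Optional[float]:
    """Return EBIT / interest, or None (standing in for "NM") when interest is zero."""
    if interest == 0:
        return None
    return ebit / interest

# Nucor 2011, in millions, as cited in the text.
pretax_income = 1251.81
operating_interest = 178.81      # interest classified as an operating expense
non_operating_interest = 0.0     # nothing was labeled non-operating

# Naive calculation: only non-operating interest is treated as interest.
naive_ebit = pretax_income + non_operating_interest
naive_coverage = interest_coverage(naive_ebit, non_operating_interest)

# Workaround: combine operating and non-operating interest before computing.
total_interest = operating_interest + non_operating_interest
combined_ebit = pretax_income + total_interest
combined_coverage = interest_coverage(combined_ebit, total_interest)

print(f"Naive EBIT: {naive_ebit:.2f}, coverage: {naive_coverage}")
print(f"Combined EBIT: {combined_ebit:.2f}, coverage: {combined_coverage:.1f}x")
```

The naive path reproduces the 1,251.81 "EBIT" and an undefined coverage ratio; the combined path recovers the 1,430.62 figure and a coverage ratio of about eight times.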
Theme 5: Non-‐standard Reporting Formats
If you look in any how-‐to book dealing with company analysis, chances are you’ll see a presentation that assumes the same sort of reporting format and addresses such matters as gross profit, working capital, depreciation, capital spending and so forth. Is that really what it’s all about? Consider carefully.
How would you describe working capital for a bank? Come to think of it, how would you even describe cost of goods sold? Above, we fussed considerably over the Thomson Reuters attempt to carve out a line item called operating interest expense. It doesn’t comport with reality – or at least not usually.
While interest expense is not an operating cost for a steel producer like Nucor, it most definitely can be an operating expense for a bank if we’re dealing with interest paid on deposits. And good luck scouring bank 10-‐Ks looking to identify working capital. That’s not a factor in the banking business.
Most widely-‐considered discussions of financial analysis are limited, to an extent often unappreciated by many investors, to industrial companies, companies that make things. By now, we’ve comfortably stretched that to include basic service-‐oriented businesses, such as retailing. But we’re still finding it hard to come to grips with companies in such areas as banking, real estate and insurance.
When Compustat got going in the early 1960s, the solution was to plug non-‐standard companies into the standard format as best as could be done. But as the years passed, databases got more sophisticated and vendors started creating distinctive reporting formats. Thomson Reuters, for example, has four: Industrial, Utility, Bank, and Insurance. Compustat surpassed that and offers Industrial, Bank, Broker, Insurance, Real Estate and Other Financial Service.
For referential purposes (the creation of a web page presenting financials for a specific company), this is terrific. Users can now see reasonably relevant financial-‐statement presentations even without having to find a web site that offers “as reported” financials.
But for our work, screening and ranking, multiple reporting formats do not work so well. Portfolio123 has “if-‐then” capabilities (the EVAL function) which enable venturesome users to essentially build four models in one based on industry classification. Unfortunately, though, this could make even the simplest screening rule or ranking factor horrendously complex by the time we’ve finished articulating the nested EVAL functions and coping with typo-‐hell to specify the correct sets of matching parentheses. So the typical approach is to build models based on the most widely used (industrial) reporting format and accept that non-‐industrials will be excluded from the results if any of the rules point to items that, for those companies, are “NA” (Not Available). If, for example, your screen contains a rule requiring Gross Margin to be above the industry average, don’t expect to see any banks. Because of “NA” gross margins, they’ll all be excluded no matter how many other tests they pass.
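The way an "NA" value silently removes a company can be sketched in plain Python. The companies, tickers, and values below are made up for illustration, and the code is not Portfolio123 rule syntax; it only mimics the behavior described above, in which a rule that touches an unavailable item fails for that company.

```python
from typing import Optional

# Made-up companies for illustration; None stands in for "NA".
companies = [
    {"ticker": "INDU1", "industry": "Industrial", "gross_margin": 0.42},
    {"ticker": "INDU2", "industry": "Industrial", "gross_margin": 0.18},
    {"ticker": "BANK1", "industry": "Bank", "gross_margin": None},
]

def passes_margin_rule(gross_margin: Optional[float], threshold: float) -> bool:
    """A rule on an unavailable item fails, however good the company otherwise is."""
    return gross_margin is not None and gross_margin > threshold

threshold = 0.25

passed = [c["ticker"] for c in companies
          if passes_margin_rule(c["gross_margin"], threshold)]

# Companies that never had a chance: excluded only because the item is NA.
dropped_for_na = [c["ticker"] for c in companies if c["gross_margin"] is None]

print("Passed the gross-margin rule:", passed)          # ['INDU1']
print("Excluded solely for NA data: ", dropped_for_na)  # ['BANK1']
```

Note that INDU2 was rejected on the merits, while BANK1 was rejected because of its reporting format. The screen's output gives no hint of the difference unless you go looking for it.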
As of this writing (mid-‐2012), many might be tempted to think “Good riddance, those banks have been nothing but trouble anyway.” I urge you to resist such a temptation. If you want to bar banks from your results, you could just as easily do so with a rule stating that “Industry is not equal to Bank.” Relying instead on supposedly-‐agnostic fundamental factors to do this could hurt you just as easily as it could help you (actually, as of this writing, many banks, recovering from the depths of the financial crisis, have been seeing their shares rise briskly, meaning you’d be hard pressed to keep up with the market if your models were inadvertently but automatically disqualifying them).
As a result, if one wants to avoid inadvertent industry bias, one would normally have to make sure the model contained no rules or factors that would produce “NA” results based solely on reporting format. So, for example, if you are open minded regarding the presence or absence of Banks in your results, make sure your models don’t refer to gross margin.
Compustat tries to offer an alternative, based on its early efforts to plug all companies into a single format. Although Compustat has evolved beyond that, it did not wipe out the old protocols. That means we, and other licensees who use the data to build models, have a choice. We can opt to work with the newer industry-specific formats, or we can stick with the old protocol, the one wherein all companies were plugged into a single format (for example, for cost of goods sold, which is not reported by banks, Compustat chooses basic business expenditure items it deems most analogous to what we aim for when we think of traditional cost of goods sold). We chose the latter. That means Banks will not be automatically excluded from a model that includes a reference to something like Gross Margin. This isn’t perfect: Compustat does not impute current assets or current liabilities to banks, so they’ll still be excluded by models that refer to current ratio, quick ratio or working capital. But by offering conventional industrial-format ratios where possible, we’ll lose fewer companies based on reporting format than we did in the past.
Now, does it seem a bit chintzy to try to plug specialized bank line items into a standard industrial format? Is it worth doing so in order to try to reduce the number of companies disqualified as “NA?”
It’s easy to see where one might suspect this to be the case. Actually, though, Compustat’s approach to the reclassification is anything but casual. Consider Macatawa Bank (MCBC) and its long-‐term debt for 2011. Here’s what we see in the 10-‐K:
Other Borrowed funds 148,603
Long-‐term debt 41,238
Subordinated debt 1,650
Accrued expenses and other liabilities 6,461
Total Liabilities 1,413,241
Banks are a bit tricky since much of what they “have” consists of money that can be considered to have been borrowed. This is so even for deposits. (That’s why the money they pay to their customers is referred to as “interest.”)
Fortunately, Macatawa looks easy. Thomson Reuters noticed one 41,238 line item labeled long-‐term debt and picked up that figure. They also recognized Subordinated debt, 1,650 in this case, as another item that should be included in Long-‐term Debt. They added the items and reported 42,900 (don’t worry about the rounding differences). Thomson Reuters ignored the “Other Borrowed funds” line item for reasons that will be noted below.
Compustat agrees with Thomson Reuters regarding the 41,238 and 1,650 figures. But as to the “Other Borrowed funds” item, it went into the footnotes, saw that these were Federal Home Loan Bank Advances, correctly determined that these are, indeed, debt obligations, and added all of them in except for a 36,781 chunk of money identified at the bottom of the footnote as being due within one year. It reports that amount as “Current Portion of LT Debt,” and reports long-‐term debt as 154,700.
The difference here is that Compustat considered FHLB Advances to be debt. Thomson Reuters did not. Who is right?
Thomson Reuters has one opinion. Compustat has another. But consider the following, from the FDIC (Federal Deposit Insurance Corp.) web site:
The Federal Home Loan Bank System (FHLBS) was chartered by Congress in 1932. It provides liquidity to member institutions that hold mortgages in their portfolios and facilitates the financing of home mortgages by making low-‐cost loans, called advances, to those members.
(Emphasis in original.)
This is not an instance of isolated word usage. Elsewhere on the site, the FDIC analyzes the increasing reliance of banks on this kind of capital given the diminution of their deposit bases and the corresponding increase in debt-like risks that come with interest and maturity terms.
So it seems that Compustat has a highly credible and powerful ally supporting its view that FHLB Advances ought to be considered debt. The language of the quoted FDIC text (“provides liquidity”) sounds a lot more like short-‐term debt than long-‐term. But in the case of Macatawa Bank, the 10-‐K footnote clearly specified how much was due within a year and how much was subject to longer maturities, so Compustat was able to easily and reliably make the distinctions.
It’s not as if Thomson Reuters ignored the footnotes. It went there too and also noticed that “Other borrowed funds” consisted of FHLB Advances. In fact, in its standardized bank balance sheet, it has a line item called “FHLB Advances” and that’s where it lists the item.
Clearly, Thomson Reuters meant well. It gave users a good, properly detailed, presentation of the item. But that presentation doesn’t work for us. It harkens back to Thomson Reuters' heritage as an “as reported” database and would look great in an HTML report on Yahoo! Finance or a site like that.
But we who use the data for screening and ranking are not well served by Thomson Reuters’ decision to have taken what our models clearly need to pick up as a debt item and to have classified it as something different. Indeed, screens and ranking systems using Thomson Reuters data understate debt levels and risk for banks, especially the ones less able to get capital from depositors or the conventional capital markets and find it necessary to lean on the FHLB. Remember: The as-reported FHLB items will show up in an HTML presentation. But they won’t show as debt in screens based on Thomson Reuters, which work with the standardized long-term debt data fields. Hence users of Thomson Reuters data who care about debt ratios will have models that understate Macatawa’s long-term debt by about 70%!
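The Macatawa figures can be checked with a short calculation. The Python sketch below uses only the amounts quoted above (thousands of U.S. dollars); the variable names are mine, and the two totals approximate, before rounding, the long-term debt figures each provider reports.

```python
# Macatawa Bank 2011, thousands of USD, from the 10-K items quoted above.
long_term_debt_line = 41238
subordinated_debt = 1650
other_borrowed_funds = 148603       # FHLB advances, per the footnotes
fhlb_due_within_one_year = 36781    # current portion, per the footnotes

# Treatment that leaves FHLB advances out of long-term debt.
ltd_without_fhlb = long_term_debt_line + subordinated_debt

# Treatment that counts FHLB advances as debt, net of the current portion.
ltd_with_fhlb = ltd_without_fhlb + other_borrowed_funds - fhlb_due_within_one_year

understatement = 1 - ltd_without_fhlb / ltd_with_fhlb

print(ltd_without_fhlb)          # 42888, reported as about 42,900
print(ltd_with_fhlb)             # 154710, reported as about 154,700
print(f"{understatement:.0%}")   # 72%
```

The 72% result is the basis for the "about 70%" understatement mentioned above.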
Perhaps Compustat’s approach to plugging specialized reporting into the standard industrial formats (which harkens back to its heritage of interpreting the financial statements and presenting metrics most likely to be useful to those who build fundamental models) isn’t so chintzy after all.
As noted at the outset, status quo is the most comfortable data option for us. But it is not an effective option. Data is important to the work we do. No matter how diligent we are, our models can never be better than the data upon which they are based.
We, in general, have done well up till now. Thomson Reuters has given us a lot of good content. But as we have seen here, there is considerable room for improvement, and we had been working to improve our situation by re-calculating, where feasible, standardized items we receive from Thomson Reuters to better comport with our needs. Now that we have an opportunity to go directly to Compustat, we believe doing so is the best course of action for our users.