Charity Tax Returns
The Charity Tax Return database is composed of historical data the Canada Revenue Agency (CRA) shared with the Investigative Journalism Foundation (IJF) as well as more recent data scraped from the CRA website. The database includes charity tax data from 1990 to the present day.
Charity data disclosure
Under the Income Tax Act, registered charities in Canada are legally required to file an information return annually. Most of the information from these returns is publicly available, except certain confidential information.
The return must be filed within six months after the end of the charity’s fiscal year, and each charity has its own fiscal year-end date. A complete information return includes Form T3010, Registered Charity Information Return; a copy of the charity’s own financial statements; Form T1235, Directors/Trustees and Like Officials Worksheet; and, if applicable, Form T1236, Qualified Donees Worksheet / Amounts Provided to Other Organizations, and Form T2081, Excess Corporate Holdings Worksheet for Private Foundations.
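As a rough illustration of the six-month filing window, the sketch below adds six calendar months to a fiscal year-end date. The function name `filing_deadline` is hypothetical, and clamping the day to the end of a shorter month is an assumption made for this example; the CRA's own guidance governs how a deadline is actually determined.

```python
import calendar
from datetime import date


def filing_deadline(fiscal_year_end: date, months: int = 6) -> date:
    """Return the date `months` calendar months after the fiscal year end.

    Assumption for this sketch: if the target month is shorter than the
    starting month, the day is clamped to the last day of the target month.
    """
    month_index = fiscal_year_end.month - 1 + months
    year = fiscal_year_end.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(fiscal_year_end.day, last_day))


print(filing_deadline(date(2021, 12, 31)))  # 2022-06-30
print(filing_deadline(date(2021, 3, 31)))   # 2021-09-30
```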
Data collection
The IJF received tax return data from the CRA for all Canadian charities from January 1990 to June 2021. However, there is a delay of several months between when the CRA publishes tax return data on its website and when it makes that data available for download. To obtain recent data without waiting for the download, the IJF team collects tax return data from June 2021 onwards by scraping the List of Charities web page, using the “Advanced Search” option.
The following sections from each charity’s “Full view” page are scraped:
- Basic information
- Section A: Identification
- Section B: Directors/Trustees and Like Officials
- Section C: Programs and general information
- Section D: Financial information
- Schedule 1 - Foundations
- Schedule 2 - Activities outside Canada
- Schedule 3 - Compensation
- Schedule 5 - Non-cash gifts
- Schedule 6 - Detailed financial information
- Form T1236 - Qualified donees worksheet / Amounts provided to other organizations
Data cleaning
A standard T3010 tax return form has hundreds of fields, each of which corresponds to a line number (e.g. line 4700 in the 2021 tax return form is “total revenue”). Each line number on the form corresponds to a column in our charities database. The T3010 return has changed multiple times since 1990, and the numbers and definitions associated with these fields have also changed over time. For example, “total assets” is line 126 from 1990 to 1997, line 58 from 1997 to 2002, and line 4200 from 2003 to present day.
Because the CRA data provided to us included only line numbers, the IJF team built a schema that matches each line number to its correct definition for the corresponding year. Some line numbers changed definition between years, and others stopped being used by the CRA after a certain year. For example, line 128 represented “Amounts payable to founders, officers, directors, members, organizations related to such persons” from 1990 to 1996 and “total disbursements” from 1997 to 2002, and it does not exist after 2002.
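A schema of this kind can be pictured as a lookup keyed on line number and year. The sketch below is a minimal illustration built only from the examples quoted above; the names `LineDefinition`, `SCHEMA` and `define` are hypothetical and do not describe the IJF's actual implementation.

```python
from typing import NamedTuple, Optional


class LineDefinition(NamedTuple):
    line: int
    first_year: int
    last_year: Optional[int]  # None means the line is still in use
    label: str


# Entries taken from the examples in the text; a full schema would
# hold hundreds of rows.
SCHEMA = [
    LineDefinition(126, 1990, 1997, "total assets"),
    LineDefinition(58, 1997, 2002, "total assets"),
    LineDefinition(4200, 2003, None, "total assets"),
    LineDefinition(128, 1990, 1996,
                   "amounts payable to founders, officers, directors, "
                   "members, organizations related to such persons"),
    LineDefinition(128, 1997, 2002, "total disbursements"),
]


def define(line: int, year: int) -> Optional[str]:
    """Return the meaning of a line number in a given year, if any."""
    for entry in SCHEMA:
        started = entry.first_year <= year
        not_ended = entry.last_year is None or year <= entry.last_year
        if entry.line == line and started and not_ended:
            return entry.label
    return None  # the line was not in use that year (or is not in this sketch)


print(define(128, 2000))  # total disbursements
print(define(128, 2010))  # None
```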
After mapping column names by year, the IJF team selected about 250 of the more than 600 fields to appear on the Charity Tax Return webpage. These columns include fields with the same or similar meaning (e.g. total revenue) that are represented by different line numbers depending on the year and exist in our database as separate columns (e.g. line 109, line 118 and line 4700 are all total revenue columns). A typical charity tax return results page has about 30 rows, depending on the year.
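Because the same concept can sit in different columns depending on the year, anyone reading the data has to check several columns to find one value. The sketch below shows one way to do that for total revenue. The column names (`line_109` and so on) and the dollar figures are made up for illustration, and the function is not part of the IJF's code.

```python
from typing import Mapping, Optional

# Hypothetical column names for the three total revenue lines named in the text.
TOTAL_REVENUE_COLUMNS = ("line_109", "line_118", "line_4700")


def total_revenue(row: Mapping[str, Optional[float]]) -> Optional[float]:
    """Return the first populated total revenue column in a row."""
    for column in TOTAL_REVENUE_COLUMNS:
        value = row.get(column)
        if value is not None:
            return value
    return None


# Illustrative rows with invented figures.
return_using_line_118 = {"line_109": None, "line_118": 125000.0, "line_4700": None}
return_using_line_4700 = {"line_4700": 310000.0}

print(total_revenue(return_using_line_118))   # 125000.0
print(total_revenue(return_using_line_4700))  # 310000.0
```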
The selection of columns includes the main financial categories in the T3010 form (revenue, expenditures, assets, gifts and liabilities) as well as basic information about the charity, including its business number, location and category. Fields excluded from the Tax Return webpage include more specific information, such as details on non-cash gifts received by the charity (e.g. line 500 “Artwork/wine/jewellery”, line 505 “Building materials”).
The IJF team also cleaned the data for standardization purposes, including lowercasing text that was originally in all capitals. Where possible, we also simplified and shortened column names to make them easier for users to understand (e.g. line 4250 “Amount included in lines 4150, 4155, 4160, 4165 and 4170 not used in charitable activities” was changed to “Assets not used in charitable activities”).
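The two cleaning steps described here, renaming columns and lowercasing all-caps text, could look roughly like the following. This is a sketch under stated assumptions: the record keys, the `category` field and its value are invented for the example, and only the two labels shown come from the text.

```python
# Short labels for two line numbers mentioned in the text.
COLUMN_LABELS = {
    "4250": "Assets not used in charitable activities",
    "4700": "Total revenue",
}


def lowercase_if_all_caps(value: str) -> str:
    """Lowercase a string only when every cased character is uppercase."""
    return value.lower() if value.isupper() else value


def clean_record(record: dict) -> dict:
    """Rename known line-number keys and normalize all-caps text values."""
    cleaned = {}
    for key, value in record.items():
        label = COLUMN_LABELS.get(key, key)  # unknown keys are kept as-is
        if isinstance(value, str):
            value = lowercase_if_all_caps(value)
        cleaned[label] = value
    return cleaned


raw = {"4250": 1000, "category": "RELIEF OF POVERTY"}  # invented sample record
print(clean_record(raw))
# {'Assets not used in charitable activities': 1000, 'category': 'relief of poverty'}
```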
Limitations
The Charity Tax Returns database uses data from the CRA. Charities may have made errors when filling out their returns, and those errors would be captured in our database. Additionally, the IJF made editorial choices during the data cleaning process, which included deleting duplicate columns, renaming columns and determining which columns sum to total columns, such as total revenue or total expenditures.
The IJF combined and renamed columns for clarity and concision (see the data cleaning section above). For example, we turned line 4250, “Amount included in lines 4150, 4155, 4160, 4165 and 4170 not used in charitable activities”, into “Assets not used in charitable activities”. Edits of this kind reduce the specificity available in the CRA’s original data.
Charity Staff Compensation
The Charity Staff Compensation database includes information on how many staff charities have, the salary ranges for the highest compensated staff and how much charities spend on compensation in total.
Data collection
This database visualizes the Compensation section of the T3010 form and was created using the same data received from the CRA and scraped by the IJF, for the period of January 1990 to present.
The name of the section, line numbers, number of staff and compensation ranges vary by year. From 2009 to present day, “Schedule 3: Compensation” contains the compensation information. Previously, the section was called “Section D: Remuneration” from 1995 to 1997, “Section F: Remuneration and Benefits” from 1998 to 2002 and “Section D: Compensation” from 2003 to 2008.
From 2009 to present, the form has required charities to disclose the compensation range for their 10 most highly compensated full-time staff, whereas previously only the top five were required.
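The year-dependent section names and staff counts above can be summarized as a small lookup. The function names below are hypothetical, and the sketch encodes only what this section states: no section name is given for years before 1995, so the lookup returns `None` for them.

```python
from typing import Optional

# (first year, last year or None for "present", section name)
SECTION_NAMES = [
    (1995, 1997, "Section D: Remuneration"),
    (1998, 2002, "Section F: Remuneration and Benefits"),
    (2003, 2008, "Section D: Compensation"),
    (2009, None, "Schedule 3: Compensation"),
]


def compensation_section(year: int) -> Optional[str]:
    """Return the name of the compensation section for a filing year."""
    for first, last, name in SECTION_NAMES:
        if first <= year and (last is None or year <= last):
            return name
    return None  # the text gives no section name before 1995


def top_staff_reported(year: int) -> int:
    """Return how many top-compensated staff the form asked about."""
    return 10 if year >= 2009 else 5


print(compensation_section(2000))  # Section F: Remuneration and Benefits
print(top_staff_reported(2015))    # 10
```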
Data cleaning
As with the larger tax return database, the IJF team matched each line number from the compensation section to its corresponding definition and simplified and shortened the names (e.g. line 370 “Enter the number of part-time or part-year (for example, seasonal) employees the charity employed during the fiscal period” was changed to “Number of part-time staff”).
Limitations
The Charity Staff Compensation database uses data from the CRA. Charities may have made errors when submitting their information to the CRA and those will be captured in our database.
Additionally, the IJF made editorial choices during the data cleaning process, which included deleting duplicate columns and renaming columns, with the goal of combining data across years in as understandable a manner as possible.
Gifts Received by Charities
Each row in the Gifts Received by Charities database represents a grant or charitable donation received by a charity from another charity or foundation. The database contains the recipient of the gift, the donor, the donor’s designation (private foundation, public foundation, charitable organization), the amount of the donation and the year the donation was reported.
Data collection
The Gifts Received by Charities database was created using data from the CRA. Specifically, we used data from Form T1236, Qualified Donees Worksheet / Amounts Provided to Other Organizations, which each charity files with its annual T3010 tax return.
Data cleaning
As with the larger Charity Tax Return database, the IJF team simplified and shortened field names (e.g. “503 total amounts given to qualified donees (add lines 501 and 502)” was changed to “Total amounts given to qualified donees”). We also cleaned the data for standardization purposes, including title-casing text such as donee names and donee cities that was originally in all capitals.
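Title-casing all-caps names can be sketched with Python's standard library. The example uses `string.capwords` rather than `str.title`, because `str.title` would capitalize the letter after an apostrophe. This is an illustrative choice, not a description of the IJF's tooling, and the organization name below is invented.

```python
import string


def title_case(value: str) -> str:
    """Capitalize the first letter of each whitespace-separated word."""
    return string.capwords(value)


print(title_case("EXAMPLE COMMUNITY FOUNDATION"))  # Example Community Foundation
print(title_case("ST. JOHN'S"))                    # St. John's
```

A simple approach like this still needs manual review: hyphenated names and acronyms are not handled (for instance, only the first part of a hyphenated word is capitalized).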
Limitations
The Gifts Received by Charities database uses the raw data received from the CRA and scraped from its website. Charities may have made errors when filling out their returns, and those errors would be captured in our database. For example, each charity has a nine-digit registration number, and some charities reported nine-digit grants that exactly matched their registration numbers while reporting total annual revenue well below a billion dollars. In those cases, the IJF deleted the grant data. Additionally, in every case where a charity reported a donation in excess of a billion dollars (11 rows), that charity reported less than a million dollars in revenues and assets, so we omitted those rows as well.
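The two exclusion rules described above can be expressed as a single check. The sketch below is an interpretation of the text, not the IJF's code: the function name, parameter names and the exact comparison operators are assumptions, and the sample figures are invented.

```python
ONE_MILLION = 1_000_000
ONE_BILLION = 1_000_000_000


def is_suspect_gift(amount: int, registration_number: str,
                    reporter_revenue: float, reporter_assets: float) -> bool:
    """Flag a gift row that matches either exclusion rule from the text.

    `registration_number` is assumed to be the nine-digit number as a string.
    """
    # Rule 1: the amount is identical to the registration number and the
    # reporting charity's revenue is below a billion dollars.
    if str(amount) == registration_number and reporter_revenue < ONE_BILLION:
        return True
    # Rule 2: the amount exceeds a billion dollars while the reporting
    # charity's revenue and assets are both below a million dollars.
    return (amount > ONE_BILLION
            and reporter_revenue < ONE_MILLION
            and reporter_assets < ONE_MILLION)


print(is_suspect_gift(123456789, "123456789", 50_000, 20_000))        # True
print(is_suspect_gift(2_000_000_000, "123456789", 500_000, 300_000))  # True
print(is_suspect_gift(25_000, "123456789", 500_000, 300_000))         # False
```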
Additionally, the IJF made editorial choices during the data cleaning process, which included deleting duplicate columns and renaming columns.