Difference between primary and secondary data
The following points distinguish primary and secondary data:
- Meaning, example, and definition,
- Data's originality,
- Need of adjustment,
- Data sources,
- Type of data,
- Methods used to collect data,
- Obtained data's reliability,
- The time consumed,
- Need of investigator,
- Cost effectiveness,
- When are the data collected?
- Capability to solve a problem,
- Suitability to meet the requirement,
- Bias or personal prejudice,
- Who collects the data? and
- Precaution before using the data.
Now let's compare primary and secondary data on the above sixteen points.
1. Meaning, example, and definition
Primary data are fresh (new) information collected for the first time by a researcher himself for a particular purpose. They are unique, first-hand information, often qualitative, that has not been published before. They are collected systematically from their place or source of origin by the researcher himself or by his appointed agents. They are obtained as a result of research efforts made by a researcher (and his team) with some objective in mind. They help to solve problems in any domain or sphere of interest. Once primary data have been used for the required purpose, their original character is lost, and they turn into secondary data.
Note that data originally collected by somebody else from the source for his own study, but never used, are still called primary data. Once used, however, they turn into secondary data.
For example, visiting an unexplored cave to investigate it and recording its minute details for publication is primary data collection.
Wessel defines primary data as follows:
“Data originally collected in the process of investigation are known as primary data.”
Secondary data, on the other hand, are information already collected by somebody else and later used by a researcher (or investigator) to answer the question in hand. Hence, they are also called second-hand data. They are ready-made, mostly quantitative information obtained from published sources such as company reports and government statistics. Here the required information is extracted from the already known works of others (e.g., works published by a subject scholar, an organization, or a government agency). They are readily available to a researcher at his desk or place of work.
For example, preparing a brief report on your country's population by referring to the census published by the government is secondary data collection.
Sir Wessel defined secondary data in simple words as follows:
“Data collected by other persons are called secondary data.”
Another definition of secondary data, in the words of M. M. Blair:
“Secondary data are those which are already in existence and collected for some other purpose than the answering of the question in hand.”
2. Data's originality
Primary data are collected by a researcher (or investigator) at their place or source of origin. They are original, unique information.
Secondary data are collected by a researcher (or investigator) from the already existing works of others. They are neither original nor unique.
3. Need of adjustment
Primary data are collected to accomplish a fixed objective and are obtained with a specific focus in mind. Hence, they do not need any adjustment before being used to satisfy the purpose of an inquiry.
Secondary data are the work of someone else, done for some other purpose. They are not focused on the researcher's objective. As a result, they need to be properly adjusted and arranged before actual use. Only after proper adjustment can they be used, to some extent, to achieve the researcher's aim.
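As a rough illustration of such an adjustment, the sketch below takes a few made-up published population figures and prepares them for a study. The records, the field names, and the `adjust` helper are all hypothetical and are assumed only for this example. It drops years outside the study period, skips missing figures, and converts thousands into persons.

```python
# Hypothetical secondary data: population figures published in thousands,
# covering more years than the study needs. These values are invented.
published = [
    {"year": 2001, "population_thousands": 1200.5},
    {"year": 2011, "population_thousands": 1350.0},
    {"year": 2021, "population_thousands": None},  # figure not published
]


def adjust(records, first_year):
    """Keep usable records from first_year onward and convert to persons."""
    adjusted = []
    for row in records:
        if row["year"] < first_year:
            continue  # outside the period of the researcher's inquiry
        if row["population_thousands"] is None:
            continue  # missing value, cannot be used as it stands
        adjusted.append({
            "year": row["year"],
            "population": round(row["population_thousands"] * 1000),
        })
    return adjusted


study_data = adjust(published, first_year=2005)
print(study_data)
```

Primary data collected for the same study would already be in the required period and units, so a step like this would normally be unnecessary.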
4. Data sources
Primary data are collected systematically through the following activities:
- Conducting surveys,
- Taking in-depth interviews of respondents (the individuals who give the necessary information to the interviewer),
- Experimentation,
- Direct observation,
- Ethnographic research (primarily the study of an ethnic group of people and their culture),
- Focus groups,
- Participatory research, etc.
Secondary data are collected from internal and external published sources.
Internal sources of secondary data are:
- Company's accounts,
- Sales figures,
- Reports and records,
- Promotional campaigns' data,
- Customers' feedback,
- Cost information,
- Marketing activities, and so on.
External sources of secondary data include:
- Data published by the country's central, state, and local governments,
- Data published by foreign governments,
- Publications released by international organizations (like the IMF, WHO, ILO, UNO, WWF, etc.) and their subsidiary bodies,
- Reports prepared by various commissions and other appointed committees,
- Results of research work published by research institutions, universities, subject scholars, economists, etc.,
- Books, newspapers, and magazines,
- Reports and journals of trade unions, industries, and business associations,
- Information released by a central bank, stock exchanges, etc.,
- Public libraries,
- Archives, Directories, Databases, and Indexes,
- Old historical records,
- Online websites, blogs, and forums.
Note: Sometimes, though rarely, unpublished information that is still available in office records can also be used as secondary data.
5. Type of data
Primary data often provide qualitative data. That is, they give information on subjective, quality-related features such as the look, feel, taste, lightness, or heaviness of the object or phenomenon under inquiry. Primary research such as surveys and experiments can, however, also yield numerical data.
Secondary data, on the contrary, mostly provide quantitative data. In other words, they give information about an object or event in numerical, statistical, or tabulated form, such as percentages, lists, and tables.
6. Methods used to collect data
Methods used to collect primary data are as follows:
- Observation, experimentation and interview method,
- The direct personal investigation,
- The indirect oral-investigation,
- Information collected through schedules and questionnaires (sets of questions), either by the enumerator method (an enumerator is a survey worker who counts and lists) or by the mailing method,
- Information obtained from correspondents or local sources,
- Some other minor methods:
  - Content analysis,
  - Consumer panels,
  - Use of mechanical devices,
  - Pantry audits,
  - Distributor or store audits,
  - Projective Techniques (PT),
  - Warranty cards, etc.
The main methods used to collect secondary data are:
- Desk research methods,
- Search on the Internet,
- Going through media generated by consumers and their groups, and so on.
7. Obtained data's reliability
Primary data are more reliable than secondary data because they are collected through original research and not from secondary sources, which may be subject to errors or discrepancies and may even contain outdated information.
Secondary data are less reliable than primary data because they are based on research done by others and not by the researcher himself. Published information cannot always be verified accurately, as the references used may not all be available or mentioned in detail.
8. The time consumed
The reliability of primary data comes at the expense of the time they consume. This is because their collection goes through the following steps:
- First, the researcher draws up a sample (i.e., a list of respondents to approach).
- Then he prepares a questionnaire (i.e., a set of questions to be asked of the respondents).
- Later, he appoints and trains a team of field interviewers to interview the respondents.
- Finally, the researcher has to analyze the data collected by the interviewers and draw a conclusion from them.
Completing the above procedure is not a quick task; it is a time-consuming one.
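The four steps above can be sketched in a few lines of code, only to show how many separate stages are involved. Everything in this sketch is assumed for illustration: the list of 100 respondents, the single question, and the answers, which are simulated rather than collected in the field.

```python
import random
from collections import Counter

# Step 1: draw a sample of respondents from a hypothetical population list.
population = [f"respondent_{i}" for i in range(1, 101)]
rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample = rng.sample(population, k=10)

# Step 2: prepare a questionnaire (a single closed question here).
questionnaire = ["Do you use the product weekly? (yes/no)"]

# Step 3: field interviewers record the answers.
# The answers are simulated here; they are not real survey data.
answers = {person: rng.choice(["yes", "no"]) for person in sample}

# Step 4: analyze the collected answers and summarize them.
tally = Counter(answers.values())
share_yes = tally["yes"] / len(sample)
print(f"{len(sample)} respondents, share answering yes: {share_yes:.0%}")
```

In real research, step 3 alone can take days or weeks, which is where most of the time goes.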
On the contrary, collecting secondary data consumes less time than collecting primary data. This is because secondary data are mostly collected without interviews, as follows:
- Here, a researcher relies heavily on ready-made data and collects them from internal and external published sources (see point 4).
- He depends on data already analyzed and concluded by someone else to get an understanding of his subject or research interest.
- He doesn't spend time appointing field interviewers and waiting for their data.
He saves his precious work hours, and, as a result, it takes him less time to collect secondary data.
9. Need of investigator
Collecting primary data requires trained researchers or investigators. They also need to be adequately supervised and controlled.
If the availability of trained investigators or the cost of hiring them is a problem, secondary methods of data collection are recommended, since they do not require hiring such investigators.
10. Cost effectiveness
Primary data collection needs the appointment of a team that mainly comprises researchers, field interviewers, data analysts, and so on. Hiring these experts, along with other additional costs, demands that more funds be allocated to complete the research work on time. For this reason, it is a costly affair.
Secondary data collection doesn't require the appointment of such a team. Since no experts are hired, the cost is minimized. As a result, it is very economical.
11. When are the data collected?
Collection of primary data starts when secondary data seem insufficient to solve the problems associated with the research. The researcher first uses secondary data; only if he finds the information collected from secondary sources inadequate does he decide to collect primary data.
Secondary data collection is the first and most economical choice for most researchers when solving an identified problem or meeting the objects of inquiry. Most of the information is extracted at this stage, and only if some information is unavailable is a decision taken to conduct primary research.
12. Capability to solve a problem
Primary data are fresh (new), original (unique), more accurate (almost correct), verified (confirmed), up-to-date (latest), and collected to satisfy the requirement (as needed). They give the required information. For this reason, they are more capable of solving a problem.
Secondary data, on the other hand, may be less accurate or riddled with errors or discrepancies, not directly related (inconsistent), and even outdated (not the latest). They give only supporting information and not the required information. As a result, they are less capable of solving a problem.
13. Suitability to meet the requirement
Primary data are suitable to meet the objects of inquiry because these are collected using systematic methods.
Collection of secondary data may or may not fulfill the actual requirement of a researcher.
14. Bias or personal prejudice
There is a possibility of personal prejudice or bias creeping in while collecting primary data because of the direct involvement of the investigator.
The researcher's own prejudice is less likely to enter secondary data because he does not collect the information at first hand, although any bias of the original collector may still be present.
15. Who collects the data?
A researcher (an investigator) or his appointed agents collect the primary data.
Secondary data are collected by anyone other than those who originally gathered them as primary data.
16. Precaution before using the data
Primary data are collected systematically by a researcher himself or by his agents, as instructed and as required, with great care, planning, and organization, followed by verification of the obtained information. Such well-processed data are less likely to be subject to errors.
For this reason, no extra precautions are necessary while using primary data.
On the other hand, secondary data, since they are collected by others for different purposes, may be inconsistent (not as required), outdated, unverified, or subject to errors. As a result, immense care must be taken when considering their use. If used without precaution, they may have an adverse impact on the quality of one's research and affect its credibility to a great extent.
We can conclude that data remain data, whether termed primary or secondary. What distinguishes one from the other is the degree of detachment from the source and how the data are collected (first-hand or second-hand) and used.
Data are primary for the agency that first gathers them, and the same data become secondary when used later by the rest of the world.
For example, data collected by an election commission are primary for the commission, and the same set of data is secondary for everyone else.
Secrist lucidly describes this as follows:
“The distinction between primary and secondary data is one of degree. Data primary in the hands of one party may be secondary in the hands of others.”
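This relative view can be shown with a tiny sketch. The `classify` function and the party names below are hypothetical and serve only as an illustration: the same data set is labeled by comparing who collected it with who is using it.

```python
def classify(collected_by: str, used_by: str) -> str:
    """Return 'primary' if the user is the party that collected the data."""
    return "primary" if collected_by == used_by else "secondary"


# The same election data, seen from two different parties.
print(classify("election commission", "election commission"))  # primary
print(classify("election commission", "news agency"))          # secondary
```

The data themselves never change; only the relationship between the collector and the user does.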