Data collection is defined as the procedure of collecting, measuring, and analyzing accurate insights for research using standard validated techniques.
A researcher can evaluate their hypothesis on the basis of collected data. In most cases, data collection is the primary and most important step for research, irrespective of the field of research.
The approach of data collection is different for different fields of study, depending on the required information.
Importance of Data Collection
- The integrity of the research: A key reason for collecting data, whether through quantitative or qualitative methods, is to ensure that the integrity of the research question is maintained.
- Reduce the likelihood of errors: The correct use of appropriate data collection methods reduces the likelihood of errors in the results.
- Decision Making: To minimize the risk of errors in decision-making, it is important that accurate data is collected so that the researcher does not make uninformed decisions.
- Save cost and time: Data collection saves the researcher time and funds that would otherwise be misspent without a deeper understanding of the topic or subject matter.
Primary and Secondary Sources of Data
There are two sources of data:
- Primary Source
- Secondary Source
1. What is Primary Source of Data
Suppose you want to know about the quality of life of the people in your town. You may choose to assess it in terms of the per capita expenditure of different households in your town.
You decide to collect the basic data yourself through a statistical survey, with the help of investigators or field workers. In doing this exercise, you are relying on a primary source of data.
Thus, a primary source of data implies collection of data from its source of origin. It offers you first-hand quantitative information relating to your statistical study.
You or your team of investigators contact the respondents and obtain the desired quantitative information on the per capita expenditure of different households in your town.
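As a rough illustration of the example above, the per capita expenditure could be worked out from survey responses along the following lines. This is only a minimal sketch: the household records, field names, and figures are hypothetical and are not taken from any real survey.

```python
# Hypothetical survey responses: one record per household (illustrative values only).
households = [
    {"household_id": "H1", "members": 4, "monthly_expenditure": 20000.0},
    {"household_id": "H2", "members": 2, "monthly_expenditure": 15000.0},
    {"household_id": "H3", "members": 5, "monthly_expenditure": 30000.0},
]


def per_capita_expenditure(record):
    """Return expenditure per household member for one survey record."""
    return record["monthly_expenditure"] / record["members"]


# Per capita expenditure of each household.
per_capita = {h["household_id"]: per_capita_expenditure(h) for h in households}

# Per capita expenditure across all surveyed households:
# total expenditure divided by total number of persons.
total_expenditure = sum(h["monthly_expenditure"] for h in households)
total_members = sum(h["members"] for h in households)
town_per_capita = total_expenditure / total_members

for household_id, value in per_capita.items():
    print(household_id, round(value, 2))
print("All households:", round(town_per_capita, 2))
```

Note that the figure for all households is computed from the totals rather than by averaging the household values, so that larger households are weighted by their size.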
2. What is Secondary Source of Data
A secondary source of data implies obtaining the relevant statistical information from an agency or an institution that is already in possession of that information.
To continue with the previous example, data relating to the quality of life of the people of your town may already have been collected by the state government.
You can simply approach the government department and request the desired information. This will be a secondary source of data for you.
Thus, a secondary source implies that the desired statistical information already exists and you simply have to collect it from the concerned agency or department.
Primary and Secondary Data
What is Primary Data
Data collected by the investigator for his own purpose for the first time from beginning to end are called primary data. These are collected from the source of origin.
The concerned investigator is the first person who collects this information. Primary data are original. Primary data are therefore first-hand information.
For example, you may be interested in studying the socio-economic status of those students in your class 11 who secured first division in their matriculation examination. You collect information regarding their pocket allowance, their family income, etc.
All this information would be termed primary information or primary data, since you happen to be the first person to collect it from its source of origin.
What is Secondary Data
Secondary data are those which are already in existence and which have been collected for some purpose other than answering the question in hand.
Data that have been collected by other persons are called secondary data. These data are therefore called second-hand data.
Since these have already been collected by somebody else, they are available in the form of published or unpublished reports.
For example, data relating to Indian Railways which are annually published by the Railway Board would be secondary data for any researcher.
Difference Between Primary Data and Secondary Data
- Primary data are data collected for the first time by the user himself, whereas secondary data are data previously collected by others and later used by another person.
- Primary data are mainly collected for a specific purpose and are used directly without any manipulation, whereas secondary data are collected for multiple purposes and are used, after necessary manipulation, to derive various kinds of inferences as required by the user.
- Primary data are collected via physical testing, observation, surveys, questionnaires, case studies, videos, eyewitnesses, personal interviews, etc., whereas secondary data are collected from data published by the state or central government, articles by local bodies, Census data, periodicals, etc.
- Primary data are collected by the user directly, so they are original and devoid of any kind of alteration, whereas secondary data are collected by others for their own usage, so they are not original.
- Primary data are in the form of raw material, which needs to be represented in proper ways to derive the necessary conclusions, whereas secondary data have already been collected and used for a specific purpose, so they are obtained in a polished form.
- Primary data are collected for the first time directly by the user, so collecting them requires a lot of time, whereas secondary data have already been collected, so using them later does not demand time for collection.
- Primary data collection is also quite expensive because the user has to conduct the surveys or questionnaires all by himself, whereas secondary data are already available in a presentable form on official websites or in magazines, which the user can use for his purpose. Hence, collecting them does not require extra expenses.
- Primary data are original because they are collected by the investigator from the source of their origin, whereas secondary data are already in existence and therefore are not original.