
The Ultimate Guide to Qualitative Research - Part 1: The Basics


Case studies

Case studies are essential to qualitative research, offering a lens through which researchers can investigate complex phenomena within their real-life contexts. This chapter explores the concept, purpose, applications, examples, and types of case studies and provides guidance on how to conduct case study research effectively.


Whereas quantitative methods look at phenomena at scale, case study research looks at a concept or phenomenon in considerable detail. While analyzing a single case can help understand one perspective regarding the object of research inquiry, analyzing multiple cases can help obtain a more holistic sense of the topic or issue. Let's provide a basic definition of a case study, then explore its characteristics and role in the qualitative research process.

Definition of a case study

A case study in qualitative research is a strategy of inquiry that involves an in-depth investigation of a phenomenon within its real-world context. It provides researchers with the opportunity to acquire an in-depth understanding of intricate details that might not be as apparent or accessible through other methods of research. The specific case or cases being studied can be a single person, group, or organization; what counts as a relevant case worth studying depends on the researcher and their research question.

Among qualitative research methods, a case study relies on multiple sources of evidence, such as documents, artifacts, interviews, or observations, to present a complete and nuanced understanding of the phenomenon under investigation. The objective is to illuminate the readers' understanding of the phenomenon beyond its abstract statistical or theoretical explanations.

Characteristics of case studies

Case studies typically possess a number of distinct characteristics that set them apart from other research methods. These characteristics include a focus on holistic description and explanation, flexibility in the design and data collection methods, reliance on multiple sources of evidence, and emphasis on the context in which the phenomenon occurs.

Furthermore, case studies can often involve a longitudinal examination of the case, meaning they study the case over a period of time. These characteristics allow case studies to yield comprehensive, in-depth, and richly contextualized insights about the phenomenon of interest.

The role of case studies in research

Case studies hold a unique position in the broader landscape of research methods aimed at theory development. They are instrumental when the primary research interest is to gain an intensive, detailed understanding of a phenomenon in its real-life context.

In addition, case studies can serve different purposes within research - they can be used for exploratory, descriptive, or explanatory purposes, depending on the research question and objectives. This flexibility and depth make case studies a valuable tool in the toolkit of qualitative researchers.

Remember, a well-conducted case study can offer a rich, insightful contribution to both academic and practical knowledge through theory development or theory verification, thus enhancing our understanding of complex phenomena in their real-world contexts.

What is the purpose of a case study?

Case study research aims for a more comprehensive understanding of phenomena, requiring various research methods to gather information for qualitative analysis. Ultimately, a case study can allow the researcher to gain insight into a particular object of inquiry and develop a theoretical framework relevant to the research inquiry.

Why use case studies in qualitative research?

Using case studies as a research strategy depends mainly on the nature of the research question and the researcher's access to the data.

Conducting case study research provides a level of detail and contextual richness that other research methods might not offer. Case studies are particularly beneficial when there's a need to understand complex social phenomena within their natural contexts.

The explanatory, exploratory, and descriptive roles of case studies

Case studies can take on various roles depending on the research objectives. They can be exploratory when the research aims to discover new phenomena or define new research questions; they are descriptive when the objective is to depict a phenomenon within its context in a detailed manner; and they can be explanatory if the goal is to understand specific relationships within the studied context. Thus, the versatility of case studies allows researchers to approach their topic from different angles, offering multiple ways to uncover and interpret the data.

The impact of case studies on knowledge development

Case studies play a significant role in knowledge development across various disciplines. Analysis of cases provides an avenue for researchers to explore phenomena within their context based on the collected data.


This can result in the production of rich, practical insights that can be instrumental in both theory-building and practice. Case studies allow researchers to delve into the intricacies and complexities of real-life situations, uncovering insights that might otherwise remain hidden.

Types of case studies

In qualitative research, a case study is not a one-size-fits-all approach. Depending on the nature of the research question and the specific objectives of the study, researchers might choose to use different types of case studies. These types differ in their focus, methodology, and the level of detail they provide about the phenomenon under investigation.

Understanding these types is crucial for selecting the most appropriate approach for your research project and effectively achieving your research goals. Let's briefly look at the main types of case studies.

Exploratory case studies

Exploratory case studies are typically conducted to develop a theory or framework around an understudied phenomenon. They can also serve as a precursor to a larger-scale research project. Exploratory case studies are useful when a researcher wants to identify the key issues or questions that can spur more extensive study or be used to develop propositions for further research. These case studies are characterized by flexibility, allowing researchers to explore various aspects of a phenomenon as they emerge, which can also form the foundation for subsequent studies.

Descriptive case studies

Descriptive case studies aim to provide a complete and accurate representation of a phenomenon or event within its context. These case studies are often based on an established theoretical framework, which guides how data is collected and analyzed. The researcher is concerned with describing the phenomenon in detail, as it occurs naturally, without trying to influence or manipulate it.

Explanatory case studies

Explanatory case studies are focused on explanation - they seek to clarify how or why certain phenomena occur. Often used in complex, real-life situations, they can be particularly valuable in clarifying causal relationships among concepts and understanding the interplay between different factors within a specific context.


Intrinsic, instrumental, and collective case studies

These three categories of case studies focus on the nature and purpose of the study. An intrinsic case study is conducted when a researcher has an inherent interest in the case itself. Instrumental case studies are employed when the case is used to provide insight into a particular issue or phenomenon. A collective case study, on the other hand, involves studying multiple cases simultaneously to investigate some general phenomena.

Each type of case study serves a different purpose and has its own strengths and challenges. The selection of the type should be guided by the research question and objectives, as well as the context and constraints of the research.

The flexibility, depth, and contextual richness offered by case studies make this approach an excellent research method for various fields of study. They enable researchers to investigate real-world phenomena within their specific contexts, capturing nuances that other research methods might miss. Across numerous fields, case studies provide valuable insights into complex issues.

Critical information systems research

Case studies provide a detailed understanding of the role and impact of information systems in different contexts. They offer a platform to explore how information systems are designed, implemented, and used and how they interact with various social, economic, and political factors. Case studies in this field often focus on examining the intricate relationship between technology, organizational processes, and user behavior, helping to uncover insights that can inform better system design and implementation.

Health research

Health research is another field where case studies are highly valuable. They offer a way to explore patient experiences, healthcare delivery processes, and the impact of various interventions in a real-world context.


Case studies can provide a deep understanding of a patient's journey, giving insights into the intricacies of disease progression, treatment effects, and the psychosocial aspects of health and illness.

Asthma research studies

Specifically within medical research, studies on asthma often employ case studies to explore the individual and environmental factors that influence asthma development, management, and outcomes. A case study can provide rich, detailed data about individual patients' experiences, from the triggers and symptoms they experience to the effectiveness of various management strategies. This can be crucial for developing patient-centered asthma care approaches.

Other fields

Apart from the fields mentioned, case studies are also extensively used in business and management research, education research, and political sciences, among many others. They provide an opportunity to delve into the intricacies of real-world situations, allowing for a comprehensive understanding of various phenomena.

Case studies, with their depth and contextual focus, offer unique insights across these varied fields. They allow researchers to illuminate the complexities of real-life situations, contributing to both theory and practice.


Process of case study design

Understanding the key elements of case study design is crucial for conducting rigorous and impactful case study research. A well-structured design guides the researcher through the process, ensuring that the study is methodologically sound and its findings are reliable and valid. The main elements of case study design include the research question, propositions, units of analysis, and the logic linking the data to the propositions.

Research question

The research question is the foundation of any research study. A good research question guides the direction of the study and informs the selection of the case, the methods of collecting data, and the analysis techniques. A well-formulated research question in case study research is typically clear, focused, and complex enough to merit further detailed examination of the relevant case(s).

Propositions

Propositions, though not necessary in every case study, provide a direction by stating what we might expect to find in the data collected. They guide how data is collected and analyzed by helping researchers focus on specific aspects of the case. They are particularly important in explanatory case studies, which seek to understand the relationships among concepts within the studied phenomenon.

Units of analysis

The unit of analysis refers to the case, or the main entity or entities that are being analyzed in the study. In case study research, the unit of analysis can be an individual, a group, an organization, a decision, an event, or even a time period. It's crucial to clearly define the unit of analysis, as it shapes the qualitative data analysis process by allowing the researcher to analyze a particular case and synthesize analysis across multiple case studies to draw conclusions.

Argumentation

This refers to the inferential model that allows researchers to draw conclusions from the data. The researcher needs to ensure that there is a clear link between the data, the propositions (if any), and the conclusions drawn. This argumentation is what enables the researcher to make valid and credible inferences about the phenomenon under study.

Understanding and carefully considering these elements in the design phase of a case study can significantly enhance the quality of the research. It can help ensure that the study is methodologically sound and its findings contribute meaningful insights about the case.


Conducting a case study involves several steps, from defining the research question and selecting the case to collecting and analyzing data. This section outlines these key stages, providing a practical guide on how to conduct case study research.

Defining the research question

The first step in case study research is defining a clear, focused research question. This question should guide the entire research process, from case selection to analysis. It's crucial to ensure that the research question is suitable for a case study approach. Typically, such questions are exploratory or descriptive in nature and focus on understanding a phenomenon within its real-life context.

Selecting and defining the case

The selection of the case should be based on the research question and the objectives of the study. It involves choosing a unique example or a set of examples that provide rich, in-depth data about the phenomenon under investigation. After selecting the case, it's crucial to define it clearly, setting the boundaries of the case, including the time period and the specific context.

Previous research can help guide the case study design. Considering recently published case studies can help researchers understand how cases have been selected and defined effectively, and those examples can inform how cases are defined in a new research inquiry.

Developing a detailed case study protocol

A case study protocol outlines the procedures and general rules to be followed during the case study. This includes the data collection methods to be used, the sources of data, and the procedures for analysis. Having a detailed case study protocol ensures consistency and reliability in the study.

The protocol should also consider how to work with the people involved in the research context to grant the research team access for collecting data. As mentioned in previous sections of this guide, establishing rapport is an essential component of qualitative research, as it shapes the overall potential for collecting and analyzing data.
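
To make this concrete, the sketch below captures a protocol as a simple structured record in Python. Every field and value here is hypothetical and purely illustrative; real case study protocols are typically fuller documents, but even a lightweight structure like this helps keep procedures consistent across a research team.

```python
# A minimal sketch of a case study protocol as a structured record.
# All fields and values are hypothetical examples, not a prescribed format.

case_study_protocol = {
    "research_question": "How do clinics adopt a new triage system?",
    "unit_of_analysis": "one urban primary-care clinic",
    "time_boundary": "January to December of the adoption year",
    "data_sources": ["semi-structured interviews", "meeting minutes",
                     "direct observation"],
    "access_and_rapport": "site visits agreed with the clinic manager; "
                          "informed consent obtained from all interviewees",
    "analysis_plan": "thematic coding, then cross-source triangulation",
}

# Print the protocol so the team can review it before data collection begins
for field, value in case_study_protocol.items():
    print(f"{field}: {value}")
```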

Collecting data

Gathering data in case study research often involves multiple sources of evidence, including documents, archival records, interviews, observations, and physical artifacts. This allows for a comprehensive understanding of the case. The process for gathering data should be systematic and carefully documented to ensure the reliability and validity of the study.

Analyzing and interpreting data

The next step is analyzing the data. This involves organizing the data, categorizing it into themes or patterns, and interpreting these patterns to answer the research question. The analysis might also involve comparing the findings with prior research or theoretical propositions.

Writing the case study report

The final step is writing the case study report. This should provide a detailed description of the case, the data, the analysis process, and the findings. The report should be clear, organized, and carefully written to ensure that the reader can understand the case and the conclusions drawn from it.

Each of these steps is crucial in ensuring that the case study research is rigorous, reliable, and provides valuable insights about the case.

Data collection

The type, depth, and quality of data in your study can significantly influence the validity and utility of the findings. In case study research, data is usually collected from multiple sources to provide a comprehensive and nuanced understanding of the case. This section will outline the various methods of collecting data used in case study research and discuss considerations for ensuring the quality of the data.

Interviews

Interviews are a common method of gathering data in case study research. They can provide rich, in-depth data about the perspectives, experiences, and interpretations of the individuals involved in the case. Interviews can be structured, semi-structured, or unstructured, depending on the research question and the degree of flexibility needed.

Observations

Observations involve the researcher observing the case in its natural setting, providing first-hand information about the case and its context. Observations can provide data that might not be revealed in interviews or documents, such as non-verbal cues or contextual information.

Documents and artifacts

Documents and archival records provide a valuable source of data in case study research. They can include reports, letters, memos, meeting minutes, email correspondence, and various public and private documents related to the case.


These records can provide historical context, corroborate evidence from other sources, and offer insights into the case that might not be apparent from interviews or observations.

Physical artifacts refer to any physical evidence related to the case, such as tools, products, or physical environments. These artifacts can provide tangible insights into the case, complementing the data gathered from other sources.

Ensuring the quality of data collection

Ensuring the quality of data in case study research requires careful planning and execution. The data must be reliable, accurate, and relevant to the research question. This involves selecting appropriate methods of collecting data, properly training interviewers or observers, and systematically recording and storing the data. It also includes considering ethical issues related to collecting and handling data, such as obtaining informed consent and ensuring the privacy and confidentiality of the participants.

Data analysis

Analyzing case study research involves making sense of the rich, detailed data to answer the research question. This process can be challenging due to the volume and complexity of case study data. However, a systematic and rigorous approach to analysis can ensure that the findings are credible and meaningful. This section outlines the main steps and considerations in analyzing data in case study research.

Organizing the data

The first step in the analysis is organizing the data. This involves sorting the data into manageable sections, often according to the data source or the theme. This step can also involve transcribing interviews, digitizing physical artifacts, or organizing observational data.

Categorizing and coding the data

Once the data is organized, the next step is to categorize or code the data. This involves identifying common themes, patterns, or concepts in the data and assigning codes to relevant data segments. Coding can be done manually or with the help of qualitative analysis software, which can greatly facilitate the process. Coding helps to reduce the data to a set of themes or categories that can be more easily analyzed.
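
As an illustration, the Python sketch below shows the mechanics of reducing coded data to a set of categories. The interview segments and codes are entirely hypothetical; in practice a researcher (or dedicated qualitative analysis software) assigns the codes by reading each segment, and the code here only tallies them.

```python
# A minimal sketch of tallying qualitative codes. Segments and codes
# are hypothetical; the researcher assigns codes, the script counts them.

from collections import Counter

# Each entry pairs an interview segment with the codes assigned to it
coded_segments = [
    ("I never know which inhaler to use when symptoms start.",
     ["uncertainty", "self-management"]),
    ("My doctor walked me through a written action plan.",
     ["clinician-support", "action-plan"]),
    ("Dust at work definitely makes my breathing worse.",
     ["environmental-trigger"]),
    ("The action plan helped me feel in control.",
     ["action-plan", "self-efficacy"]),
]

# Reduce the data to a frequency count of codes
code_counts = Counter(code for _, codes in coded_segments for code in codes)
for code, count in code_counts.most_common():
    print(f"{code}: {count}")
```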

Identifying patterns and themes

After coding the data, the researcher looks for patterns or themes in the coded data. This involves comparing and contrasting the codes and looking for relationships or patterns among them. The identified patterns and themes should help answer the research question.
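
One simple way to make "looking for relationships among codes" concrete is to count which codes co-occur within the same data segment, as in the hypothetical sketch below; pairs that appear together repeatedly can suggest candidate themes worth closer qualitative inspection, though they are a starting point rather than a finding.

```python
# A small sketch of surfacing patterns via code co-occurrence.
# The code lists are hypothetical, continuing the earlier example.

from collections import Counter
from itertools import combinations

coded_segments = [
    ["uncertainty", "self-management"],
    ["clinician-support", "action-plan"],
    ["action-plan", "self-efficacy"],
    ["action-plan", "self-management"],
]

co_occurrence = Counter()
for codes in coded_segments:
    # Count every unordered pair of codes appearing in the same segment
    for pair in combinations(sorted(codes), 2):
        co_occurrence[pair] += 1

# Frequently co-occurring pairs may point to a candidate theme
for pair, count in co_occurrence.most_common():
    print(pair, count)
```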

Interpreting the data

Once patterns and themes have been identified, the next step is to interpret these findings. This involves explaining what the patterns or themes mean in the context of the research question and the case. This interpretation should be grounded in the data, but it can also involve drawing on theoretical concepts or prior research.

Verification of the data

The last step in the analysis is verification. This involves checking the accuracy and consistency of the analysis process and confirming that the findings are supported by the data. This can involve re-checking the original data, checking the consistency of codes, or seeking feedback from research participants or peers.

Like any research method, case study research has its strengths and limitations. Researchers must be aware of these, as they can influence the design, conduct, and interpretation of the study.

Understanding the strengths and limitations of case study research can also guide researchers in deciding whether this approach is suitable for their research question. This section outlines some of the key strengths and limitations of case study research.

Benefits include the following:

  • Rich, detailed data: One of the main strengths of case study research is that it can generate rich, detailed data about the case. This can provide a deep understanding of the case and its context, which can be valuable in exploring complex phenomena.
  • Flexibility: Case study research is flexible in terms of design, data collection, and analysis. A sufficient degree of flexibility allows the researcher to adapt the study according to the case and the emerging findings.
  • Real-world context: Case study research involves studying the case in its real-world context, which can provide valuable insights into the interplay between the case and its context.
  • Multiple sources of evidence: Case study research often involves collecting data from multiple sources, which can enhance the robustness and validity of the findings.

On the other hand, researchers should consider the following limitations:

  • Generalizability: A common criticism of case study research is that its findings might not be generalizable to other cases due to the specificity and uniqueness of each case.
  • Time and resource intensive: Case study research can be time- and resource-intensive due to the depth of the investigation and the amount of collected data.
  • Complexity of analysis: The rich, detailed data generated in case study research can make analyzing the data challenging.
  • Subjectivity: Given the nature of case study research, there may be a higher degree of subjectivity in interpreting the data, so researchers need to reflect on this and transparently convey to audiences how the research was conducted.

Being aware of these strengths and limitations can help researchers design and conduct case study research effectively and interpret and report the findings appropriately.


Data Collection Methods | Step-by-Step Guide & Examples

Published on 4 May 2022 by Pritha Bhandari.

Data collection is a systematic process of gathering observations or measurements. Whether you are performing research for business, governmental, or academic purposes, data collection allows you to gain first-hand knowledge and original insights into your research problem.

While methods and aims may differ between fields, the overall process of data collection remains largely the same. Before you begin collecting data, you need to consider:

  • The aim of the research
  • The type of data that you will collect
  • The methods and procedures you will use to collect, store, and process the data

To collect high-quality data that is relevant to your purposes, follow these four steps.

Table of contents

  • Step 1: Define the aim of your research
  • Step 2: Choose your data collection method
  • Step 3: Plan your data collection procedures
  • Step 4: Collect the data
  • Frequently asked questions about data collection

Step 1: Define the aim of your research

Before you start the process of data collection, you need to identify exactly what you want to achieve. You can start by writing a problem statement: what is the practical or scientific issue that you want to address, and why does it matter?

Next, formulate one or more research questions that precisely define what you want to find out. Depending on your research questions, you might need to collect quantitative or qualitative data:

  • Quantitative data is expressed in numbers and graphs and is analysed through statistical methods.
  • Qualitative data is expressed in words and analysed through interpretations and categorisations.

If your aim is to test a hypothesis, measure something precisely, or gain large-scale statistical insights, collect quantitative data. If your aim is to explore ideas, understand experiences, or gain detailed insights into a specific context, collect qualitative data.

If you have several aims, you can use a mixed methods approach that collects both types of data.

For example, in a mixed methods study of employee perceptions of their managers:

  • Your first aim is to assess whether there are significant differences in perceptions of managers across different departments and office locations.
  • Your second aim is to gather meaningful feedback from employees to explore new ideas for how managers can improve.


Step 2: Choose your data collection method

Based on the data you want to collect, decide which method is best suited for your research.

  • Experimental research is primarily a quantitative method.
  • Interviews, focus groups, and ethnographies are qualitative methods.
  • Surveys, observations, archival research, and secondary data collection can be quantitative or qualitative methods.

Carefully consider what method you will use to gather data that helps you directly answer your research questions.

Step 3: Plan your data collection procedures

When you know which method(s) you are using, you need to plan exactly how you will implement them. What procedures will you follow to make accurate observations or measurements of the variables you are interested in?

For instance, if you're conducting surveys or interviews, decide what form the questions will take; if you're conducting an experiment, make decisions about your experimental design.

Operationalisation

Sometimes your variables can be measured directly: for example, you can collect data on the average age of employees simply by asking for dates of birth. However, often you’ll be interested in collecting data on more abstract concepts or variables that can’t be directly observed.

Operationalisation means turning abstract conceptual ideas into measurable observations. When planning how you will collect data, you need to translate the conceptual definition of what you want to study into the operational definition of what you will actually measure.

For example, to operationalise the concept of leadership skills (a worked sketch follows this list):

  • You ask managers to rate their own leadership skills on 5-point scales assessing the ability to delegate, decisiveness, and dependability.
  • You ask their direct employees to provide anonymous feedback on the managers regarding the same topics.
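
Here is the worked sketch referred to above. It assumes, purely for illustration, that each manager's leadership skill is operationalised as the mean of the three hypothetical 5-point items; the manager names and ratings are made up.

```python
# A worked sketch of operationalisation: "leadership skill" is defined
# as the mean of three 5-point ratings. Names and numbers are hypothetical.

ratings = {
    "Manager A": {"delegation": 4, "decisiveness": 3, "dependability": 5},
    "Manager B": {"delegation": 2, "decisiveness": 4, "dependability": 3},
}

for manager, items in ratings.items():
    composite = sum(items.values()) / len(items)  # mean of the three items
    print(f"{manager}: composite leadership score = {composite:.2f}")
```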

Sampling

You may need to develop a sampling plan to obtain data systematically. This involves defining a population, the group you want to draw conclusions about, and a sample, the group you will actually collect data from.

Your sampling method will determine how you recruit participants or obtain measurements for your study. To decide on a sampling method you will need to consider factors like the required sample size, accessibility of the sample, and time frame of the data collection.
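
As a minimal sketch, the snippet below draws a simple random sample from a hypothetical employee population using only Python's standard library. The population size, ID format, and sample size are assumptions; other sampling methods (stratified, cluster) would require different selection logic.

```python
# A minimal sketch of a simple random sampling plan.
# Employee IDs and sample size are hypothetical.

import random

population = [f"EMP-{i:04d}" for i in range(1, 501)]  # all 500 employees
sample_size = 50

random.seed(42)  # fixed seed so the selection can be documented and reproduced
sample = random.sample(population, sample_size)  # draw without replacement
print(sample[:5])  # inspect the first few selected IDs
```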

Standardising procedures

If multiple researchers are involved, write a detailed manual to standardise data collection procedures in your study.

This means laying out specific step-by-step instructions so that everyone in your research team collects data in a consistent way – for example, by conducting experiments under the same conditions and using objective criteria to record and categorise observations.

This helps ensure the reliability of your data, and you can also use it to replicate the study in the future.

Creating a data management plan

Before beginning data collection, you should also decide how you will organise and store your data.

  • If you are collecting data from people, you will likely need to anonymise and safeguard the data to prevent leaks of sensitive information (e.g. names or identity numbers). A small sketch of one approach follows this list.
  • If you are collecting data via interviews or pencil-and-paper formats, you will need to perform transcriptions or data entry in systematic ways to minimise distortion.
  • You can prevent loss of data by having an organisation system that is routinely backed up.
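
The sketch below illustrates the anonymisation point from the list above: hypothetical participant names are replaced with stable pseudonymous codes, and the name-to-code key is kept separately under restricted access. The salt value, field names, and records are all assumptions for illustration, not a prescribed scheme.

```python
# A minimal sketch of pseudonymising records before storage, assuming
# names are the only direct identifier. Real projects also need secure
# storage for the salt and key file, plus routine backups.

import hashlib

def pseudonym(name: str, salt: str) -> str:
    # Derive a stable participant code from the name plus a secret salt
    return "P-" + hashlib.sha256((salt + name).encode()).hexdigest()[:8]

records = [{"name": "Jane Doe", "response": "agree"},
           {"name": "John Roe", "response": "disagree"}]

SALT = "keep-this-secret"  # stored separately from the anonymised data
key_file = {}              # name -> code mapping, kept under restricted access
anonymised = []
for rec in records:
    code = pseudonym(rec["name"], SALT)
    key_file[rec["name"]] = code
    anonymised.append({"participant": code, "response": rec["response"]})

print(anonymised)
```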

Step 4: Collect the data

Finally, you can implement your chosen methods to measure or observe the variables you are interested in.

For example, suppose your employee survey includes closed-ended questions that ask participants to rate their manager's leadership skills on scales from 1 to 5. The data produced is numerical and can be statistically analysed for averages and patterns.

To ensure that high-quality data is recorded in a systematic way, here are some best practices:

  • Record all relevant information as and when you obtain data. For example, note down whether or how lab equipment is recalibrated during an experimental study.
  • Double-check manual data entry for errors.
  • If you collect quantitative data, you can assess its reliability and validity to get an indication of your data quality; a sketch of one common check follows this list.
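
As one common reliability check, the sketch below computes Cronbach's alpha for a set of hypothetical 5-point item scores using only the Python standard library. Values closer to 1 suggest that the items measure the same construct consistently; the scores here are made up to continue the leadership-skills example.

```python
# A sketch of Cronbach's alpha as an internal-consistency check.
# Rows = respondents, columns = items (delegation, decisiveness,
# dependability). All scores are hypothetical.

from statistics import pvariance

scores = [
    [4, 3, 5],
    [2, 4, 3],
    [5, 5, 4],
    [3, 3, 4],
]

k = len(scores[0])                                    # number of items
item_vars = [pvariance(col) for col in zip(*scores)]  # variance per item
total_var = pvariance([sum(row) for row in scores])   # variance of sum scores

# alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```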

Frequently asked questions about data collection

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organisations.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g., understanding the needs of your consumers or user testing your website).
  • You can control and standardise the process for high reliability and validity (e.g., choosing appropriate measurements and sampling methods).

However, there are also some drawbacks: data collection can be time-consuming, labour-intensive, and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to test a hypothesis by systematically collecting and analysing data, while qualitative methods allow you to explore ideas and experiences in depth.

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

In mixed methods research, you use both qualitative and quantitative data collection and analysis methods to answer your research question.

Operationalisation means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioural avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data, it's important to consider how you will operationalise the variables that you want to measure.


Statistics - Data collection - Case Study Method

Case study research is a qualitative research method that is used to examine contemporary real-life situations and apply the findings of the case to the problem under study. Case studies involve a detailed contextual analysis of a limited number of events or conditions and their relationships. It provides the basis for the application of ideas and extension of methods. It helps a researcher to understand a complex issue or object and add strength to what is already known through previous research.

Steps of the case study method

To ensure objectivity and clarity, a researcher should adopt a methodical approach to case study research. The following steps can be followed:

Identify and define the research questions - The researcher starts with establishing the focus of the study by identifying the research object and the problem surrounding it. The research object would be a person, a program, an event or an entity.

Select the cases - In this step the researcher decides on the number of cases to choose (single or multiple), the type of cases to choose (unique or typical) and the approach to collect, store and analyze the data. This is the design phase of the case study method.

Collect the data - The researcher now collects the data with the objective of gathering multiple sources of evidence with reference to the problem under study. This evidence is stored comprehensively and systematically in a format that can be referenced and sorted easily so that converging lines of inquiry and patterns can be uncovered.

Evaluate and analyze the data - In this step the researcher makes use of varied methods to analyze qualitative as well as quantitative data. The data is categorized, tabulated, and cross-checked to address the initial propositions or purpose of the study. Graphic techniques such as placing information into arrays, creating matrices of categories, and creating flow charts are used to help investigators approach the data in different ways and thus avoid drawing premature conclusions. Multiple investigators may also be used to examine the data so that a wide variety of insights into the available data can be developed.
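
As a small illustration of the "matrices of categories" technique mentioned above, the Python sketch below tabulates hypothetical cases against analytic categories so investigators can scan the evidence from different angles. The case names, categories, and counts are all invented for the example.

```python
# A minimal sketch of a matrix of categories: counts of coded evidence
# per case and per analytic category. All values are hypothetical.

cases = ["Case 1", "Case 2", "Case 3"]
categories = ["leadership", "training", "resources"]

matrix = {
    "Case 1": {"leadership": 5, "training": 2, "resources": 1},
    "Case 2": {"leadership": 1, "training": 4, "resources": 3},
    "Case 3": {"leadership": 3, "training": 0, "resources": 6},
}

# Print a simple aligned table: rows are cases, columns are categories
header = "".join(f"{c:>12}" for c in categories)
print(f"{'':8}{header}")
for case in cases:
    row = "".join(f"{matrix[case][c]:>12}" for c in categories)
    print(f"{case:8}{row}")
```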

Presentation of Results - The results are presented in a manner that allows the reader to evaluate the findings in the light of the evidence presented in the report. The results are corroborated with sufficient evidence showing that all aspects of the problem have been adequately explored. New insights gained and conflicting propositions that emerged are suitably highlighted in the report.


Continuing to enhance the quality of case study methodology in health services research

Shannon L. Sibbald

1 Faculty of Health Sciences, Western University, London, Ontario, Canada.

2 Department of Family Medicine, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.

3 The Schulich Interfaculty Program in Public Health, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada.

Stefan Paciocco

Meghan Fournie, Rachelle Van Asseldonk, and Tiffany Scurr

Abstract

Case study methodology has grown in popularity within Health Services Research (HSR). However, its use and merit as a methodology are frequently criticized due to its flexible approach and inconsistent application. Nevertheless, case study methodology is well suited to HSR because it can track and examine complex relationships, contexts, and systems as they evolve. Applied appropriately, it can help generate information on how multiple forms of knowledge come together to inform decision-making within healthcare contexts. In this article, we aim to demystify case study methodology by outlining its philosophical underpinnings and three foundational approaches. We provide literature-based guidance to decision-makers, policy-makers, and health leaders on how to engage in and critically appraise case study design. We advocate that researchers work in collaboration with health leaders to detail their research process with an aim of strengthening the validity and integrity of case study for its continued and advanced use in HSR.

Introduction

The popularity of case study research methodology in Health Services Research (HSR) has grown over the past 40 years. 1 This may be attributed to a shift towards the use of implementation research and a newfound appreciation of contextual factors affecting the uptake of evidence-based interventions within diverse settings. 2 Incorporating context-specific information on the delivery and implementation of programs can increase the likelihood of success. 3 , 4 Case study methodology is particularly well suited for implementation research in health services because it can provide insight into the nuances of diverse contexts. 5 , 6 In 1999, Yin 7 published a paper on how to enhance the quality of case study in HSR, which was foundational for the emergence of case study in this field. Yin 7 maintains case study is an appropriate methodology in HSR because health systems are constantly evolving, and the multiple affiliations and diverse motivations are difficult to track and understand with traditional linear methodologies.

Despite its increased popularity, there is debate whether a case study is a methodology (ie, a principle or process that guides research) or a method (ie, a tool to answer research questions). Some criticize case study for its high level of flexibility, perceiving it as less rigorous, and maintain that it generates inadequate results. 8 Others have noted issues with quality and consistency in how case studies are conducted and reported. 9 Reporting is often varied and inconsistent, using a mix of approaches such as case reports, case findings, and/or case study. Authors sometimes use incongruent methods of data collection and analysis or use the case study as a default when other methodologies do not fit. 9 , 10 Despite these criticisms, case study methodology is becoming more common as a viable approach for HSR. 11 An abundance of articles and textbooks are available to guide researchers through case study research, including field-specific resources for business, 12 , 13 nursing, 14 and family medicine. 15 However, there remains confusion and a lack of clarity on the key tenets of case study methodology.

Several common philosophical underpinnings have contributed to the development of case study research, 1 which has led to different approaches to planning, data collection, and analysis. This presents challenges in assessing quality and rigour for researchers conducting case studies and stakeholders reading results.

This article discusses the various approaches and philosophical underpinnings to case study methodology. Our goal is to explain it in a way that provides guidance for decision-makers, policy-makers, and health leaders on how to understand, critically appraise, and engage in case study research and design, as such guidance is largely absent in the literature. This article is by no means exhaustive or authoritative. Instead, we aim to provide guidance and encourage dialogue around case study methodology, facilitating critical thinking around the variety of approaches and ways quality and rigour can be bolstered for its use within HSR.

Purpose of case study methodology

Case study methodology is often used to develop an in-depth, holistic understanding of a specific phenomenon within a specified context. 11 It focuses on studying one or multiple cases over time and uses an in-depth analysis of multiple information sources. 16 , 17 It is ideal for situations including, but not limited to, exploring under-researched and real-life phenomena, 18 especially when the contexts are complex and the researcher has little control over the phenomena. 19 , 20 Case studies can be useful when researchers want to understand how interventions are implemented in different contexts, and how context shapes the phenomenon of interest.

In addition to demonstrating coherency with the type of questions case study is suited to answer, there are four key tenets to case study methodologies: (1) be transparent in the paradigmatic and theoretical perspectives influencing study design; (2) clearly define the case and phenomenon of interest; (3) clearly define and justify the type of case study design; and (4) use multiple data collection sources and analysis methods to present the findings in ways that are consistent with the methodology and the study’s paradigmatic base. 9 , 16 The goal is to appropriately match the methods to empirical questions and issues and not to universally advocate any single approach for all problems. 21

Approaches to case study methodology

Three authors propose distinct foundational approaches to case study methodology positioned within different paradigms: Yin, 19 , 22 Stake, 5 , 23 and Merriam 24 , 25 ( Table 1 ). Yin is strongly post-positivist whereas Stake and Merriam are grounded in a constructivist paradigm. Researchers should locate their research within a paradigm that explains the philosophies guiding their research 26 and adhere to the underlying paradigmatic assumptions and key tenets of the appropriate author’s methodology. This will enhance the consistency and coherency of the methods and findings. However, researchers often do not report their paradigmatic position, nor do they adhere to one approach. 9 Although deliberately blending methodologies may be defensible and methodologically appropriate, more often it is done in an ad hoc and haphazard way, without consideration for limitations.

Table 1. Cross-analysis of three case study approaches, adapted from Yazan 2015

The post-positive paradigm postulates there is one reality that can be objectively described and understood by “bracketing” oneself from the research to remove prejudice or bias. 27 Yin focuses on general explanation and prediction, emphasizing the formulation of propositions, akin to hypothesis testing. This approach is best suited for structured and objective data collection 9 , 11 and is often used for mixed-method studies.

Constructivism assumes that the phenomenon of interest is constructed and influenced by local contexts, including the interaction between researchers, individuals, and their environment. 27 It acknowledges multiple interpretations of reality 24 constructed within the context by the researcher and participants which are unlikely to be replicated, should either change. 5 , 20 Stake and Merriam’s constructivist approaches emphasize a story-like rendering of a problem and an iterative process of constructing the case study. 7 This stance values researcher reflexivity and transparency, 28 acknowledging how researchers’ experiences and disciplinary lenses influence their assumptions and beliefs about the nature of the phenomenon and development of the findings.

Defining a case

A key tenet of case study methodology often underemphasized in the literature is the importance of defining the case and phenomenon. Researchers should clearly describe the case with sufficient detail to allow readers to fully understand the setting and context and determine applicability. Trying to answer a question that is too broad often leads to an unclear definition of the case and phenomenon. 20 Cases should therefore be bound by time and place to ensure rigour and feasibility. 6

Yin 22 defines a case as “a contemporary phenomenon within its real-life context,” (p13) which may contain a single unit of analysis, including individuals, programs, corporations, or clinics 29 (holistic), or be broken into sub-units of analysis, such as projects, meetings, roles, or locations within the case (embedded). 30 Merriam 24 and Stake 5 similarly define a case as a single unit studied within a bounded system. Stake 5 , 23 suggests bounding cases by contexts and experiences where the phenomenon of interest can be a program, process, or experience. However, the line between the case and phenomenon can become muddy. For guidance, Stake 5 , 23 describes the case as the noun or entity and the phenomenon of interest as the verb, functioning, or activity of the case.

Designing the case study approach

Yin's approach to a case study is rooted in a formal proposition or theory which guides the case and is used to test the outcome. 1 Stake 5 advocates for a flexible design and explicitly states that data collection and analysis may commence at any point. Merriam's 24 approach blends both Yin's and Stake's, allowing the necessary flexibility in data collection and analysis to meet the needs of the study.

Yin 30 proposed three types of case study approaches—descriptive, explanatory, and exploratory. Each can be designed around single or multiple cases, creating six basic case study methodologies. Descriptive studies provide a rich description of the phenomenon within its context, which can be helpful in developing theories. To test a theory or determine cause and effect relationships, researchers can use an explanatory design. An exploratory model is typically used in the pilot-test phase to develop propositions (eg, Sibbald et al. 31 used this approach to explore interprofessional network complexity). Despite having distinct characteristics, the boundaries between case study types are flexible with significant overlap. 30 Each has five key components: (1) research question; (2) proposition; (3) unit of analysis; (4) logical linking that connects the theory with proposition; and (5) criteria for analyzing findings.

Contrary to Yin, Stake 5 believes the research process cannot be planned in its entirety because research evolves as it is performed. Consequently, researchers can adjust the design of their methods even after data collection has begun. Stake 5 classifies case studies into three categories: intrinsic, instrumental, and collective/multiple. Intrinsic case studies focus on gaining a better understanding of the case. These are often undertaken when the researcher has an interest in a specific case. Instrumental case study is used when the case itself is not of the utmost importance, and the issue or phenomenon (ie, the research question) being explored becomes the focus instead (eg, Paciocco 32 used an instrumental case study to evaluate the implementation of a chronic disease management program). 5 Collective designs are rooted in an instrumental case study and include multiple cases to gain an in-depth understanding of the complexity and particularity of a phenomenon across diverse contexts. 5 , 23 In collective designs, studying similarities and differences between the cases allows the phenomenon to be understood more intimately (for examples of this in the field, see van Zelm et al. 33 and Burrows et al. 34 In addition, Sibbald et al. 35 present an example where a cross-case analysis method is used to compare instrumental cases).

Merriam’s approach is flexible (similar to Stake) as well as stepwise and linear (similar to Yin). She advocates for conducting a literature review before designing the study to better understand the theoretical underpinnings. 24 , 25 Unlike Stake or Yin, Merriam proposes a step-by-step guide for researchers to design a case study. These steps include performing a literature review, creating a theoretical framework, identifying the problem, creating and refining the research question(s), and selecting a study sample that fits the question(s). 24 , 25 , 36

Data collection and analysis

Using multiple data collection methods is a key characteristic of all case study methodology; it enhances the credibility of the findings by allowing different facets and views of the phenomenon to be explored. 23 Common methods include interviews, focus groups, observation, and document analysis. 5 , 37 By seeking patterns within and across data sources, a thick description of the case can be generated to support a greater understanding and interpretation of the whole phenomenon. 5 , 17 , 20 , 23 This technique is called triangulation and is used to explore cases with greater accuracy. 5 Although Stake 5 maintains case study is most often used in qualitative research, Yin 17 supports a mix of both quantitative and qualitative methods to triangulate data. This deliberate convergence of data sources (or mixed methods) allows researchers to find greater depth in their analysis and develop converging lines of inquiry. For example, case studies evaluating interventions commonly use qualitative interviews to describe the implementation process, barriers, and facilitators paired with a quantitative survey of comparative outcomes and effectiveness. 33 , 38 , 39

Yin 30 describes analysis as dependent on the chosen approach, whether it be (1) deductive and rely on theoretical propositions; (2) inductive and analyze data from the “ground up”; (3) organized to create a case description; or (4) used to examine plausible rival explanations. According to Yin’s 40 approach to descriptive case studies, carefully considering theory development is an important part of study design. “Theory” refers to field-relevant propositions, commonly agreed upon assumptions, or fully developed theories. 40 Stake 5 advocates for using the researcher’s intuition and impression to guide analysis through a categorical aggregation and direct interpretation. Merriam 24 uses six different methods to guide the “process of making meaning” (p178) : (1) ethnographic analysis; (2) narrative analysis; (3) phenomenological analysis; (4) constant comparative method; (5) content analysis; and (6) analytic induction.

Drawing upon a theoretical or conceptual framework to inform analysis improves the quality of case study and avoids the risk of description without meaning. 18 Using Stake’s 5 approach, researchers rely on protocols and previous knowledge to help make sense of new ideas; theory can guide the research and assist researchers in understanding how new information fits into existing knowledge.

Practical applications of case study research

Columbia University has recently demonstrated how case studies can help train future health leaders. 41 Case studies encompass components of systems thinking—considering connections and interactions between components of a system, alongside the implications and consequences of those relationships—to equip health leaders with tools to tackle global health issues. 41 Greenwood 42 evaluated Indigenous peoples’ relationship with the healthcare system in British Columbia and used a case study to challenge and educate health leaders across the country to enhance culturally sensitive health service environments.

An important but often omitted step in case study research is an assessment of quality and rigour. We recommend using a framework or set of criteria to assess the rigour of the qualitative research. Suitable resources include Caelli et al., 43 Houghton et al., 44 Ravenek and Rudman, 45 and Tracy. 46

New directions in case study

Although “pragmatic” case studies (ie, utilizing practical and applicable methods) have existed within psychotherapy for some time, 47 , 48 only recently has the applicability of pragmatism as an underlying paradigmatic perspective been considered in HSR. 49 This is marked by uptake of pragmatism in Randomized Control Trials, recognizing that “gold standard” testing conditions do not reflect the reality of clinical settings 50 , 51 nor do a handful of epistemologically guided methodologies suit every research inquiry.

Pragmatism positions the research question as the basis for methodological choices, rather than a theory or epistemology, allowing researchers to pursue the most practical approach to understanding a problem or discovering an actionable solution. 52 Mixed methods are commonly used to create a deeper understanding of the case through converging qualitative and quantitative data. 52 Pragmatic case study is suited to HSR because its flexibility throughout the research process accommodates complexity, ever-changing systems, and disruptions to research plans. 49,50 Much like case study, pragmatism has been criticized for its flexibility and for its use when other approaches are seemingly a poor fit. 53,54 As with case study, authors argue that this criticism reflects a lack of investigation and proper application rather than a lack of validity, underscoring the need for more exploration and conversation among researchers and practitioners. 55

Although occasionally misunderstood as a less rigorous research methodology, 8 case study research is highly flexible and allows for contextual nuances. 5,6 Its use is valuable when the researcher desires a thorough understanding of a phenomenon or case bound by context. 11 If needed, multiple similar cases can be studied simultaneously, or one case within another. 16,17 There are currently three main approaches to case study, 5,17,24 each with its own definitions of a case, ontological and epistemological paradigms, methodologies, and data collection and analysis procedures. 37

Individuals’ experiences within health systems are heavily influenced by contextual factors, participant experience, and intricate relationships between different organizations and actors. 55 Case study research is well suited for HSR because it can track and examine these complex relationships and systems as they evolve over time. 6,7 It is important that researchers and health leaders using this methodology understand its key tenets and how to conduct a proper case study. Although there are many examples of case study in action, they are often under-reported and, when reported, not rigorously conducted. 9 Thus, decision-makers and health leaders should use these examples with caution. The proper reporting of case studies is necessary to bolster their credibility in the HSR literature and to provide readers with sufficient information to critically assess the methodology. We also call on health leaders who frequently use case studies 56-58 to report them in the primary research literature.

The purpose of this article is to advocate for the continued and advanced use of case study in HSR and to provide literature-based guidance for decision-makers, policy-makers, and health leaders on how to engage in, read, and interpret findings from case study research. As health systems progress and evolve, the application of case study research will continue to increase as researchers and health leaders aim to capture the inherent complexities, nuances, and contextual factors. 7


Case Study Research Method in Psychology


Case studies are in-depth investigations of a person, group, event, or community. Typically, data is gathered from various sources using several methods (e.g., observations & interviews).

The case study research method originated in clinical medicine (the case history, i.e., the patient’s personal history). In psychology, case studies are often confined to the study of a particular individual.

The information is mainly biographical and relates to events in the individual’s past (i.e., retrospective), as well as to significant events that are currently occurring in his or her everyday life.

The case study is not itself a research method; rather, researchers select methods of data collection and analysis that will generate material suitable for case studies.

Freud (1909a, 1909b) conducted very detailed investigations into the private lives of his patients in an attempt to both understand and help them overcome their illnesses.

For this reason, the case study is a method that should only be used by a psychologist, therapist, or psychiatrist, i.e., someone with a professional qualification.

There is an ethical issue of competence. Only someone qualified to diagnose and treat a person can conduct a formal case study relating to atypical (i.e., abnormal) behavior or atypical development.


Famous Case Studies

  • Anna O – One of the most famous case studies, documenting psychoanalyst Josef Breuer’s treatment of “Anna O” (real name Bertha Pappenheim) for hysteria in the late 1800s using early psychoanalytic theory.
  • Little Hans – A child psychoanalysis case study published by Sigmund Freud in 1909 analyzing his five-year-old patient Herbert Graf’s house phobia as related to the Oedipus complex.
  • Bruce/Brenda – Gender identity case of the boy (Bruce) whose botched circumcision led psychologist John Money to advise gender reassignment and raise him as a girl (Brenda) in the 1960s.
  • Genie Wiley – Linguistics/psychological development case of the victim of extreme isolation abuse who was studied in 1970s California for effects of early language deprivation on acquiring speech later in life.
  • Phineas Gage – One of the most famous neuropsychology case studies, analyzing personality changes in railroad worker Phineas Gage after an 1848 brain injury in which a tamping iron pierced his skull.

Clinical Case Studies

  • Studying the effectiveness of psychotherapy approaches with an individual patient
  • Assessing and treating mental illnesses like depression, anxiety disorders, PTSD
  • Neuropsychological cases investigating brain injuries or disorders

Child Psychology Case Studies

  • Studying psychological development from birth through adolescence
  • Cases of learning disabilities, autism spectrum disorders, ADHD
  • Effects of trauma, abuse, deprivation on development

Types of Case Studies

  • Explanatory case studies: Used to explore causation in order to find underlying principles. Helpful for qualitative analysis that explains presumed causal links.
  • Exploratory case studies: Used to explore situations in which an intervention being evaluated has no clear set of outcomes. They help define questions and hypotheses for future research.
  • Descriptive case studies: Describe an intervention or phenomenon and the real-life context in which it occurred. They are helpful for illustrating certain topics within an evaluation.
  • Multiple-case studies: Used to explore differences between cases and replicate findings across cases. Helpful for comparing and contrasting specific cases.
  • Intrinsic: Used to gain a better understanding of a particular case. Helpful for capturing the complexity of a single case.
  • Collective: Used to explore a general phenomenon using multiple case studies. Helpful for jointly studying a group of cases in order to inquire into the phenomenon.

Where Do You Find Data for a Case Study?

There are several places to find data for a case study. The key is to gather data from multiple sources to get a complete picture of the case and corroborate facts or findings through triangulation of evidence. Most of this information is likely qualitative (i.e., verbal description rather than measurement), but the psychologist might also collect numerical data.

1. Primary sources

  • Interviews – Interviewing key people related to the case to get their perspectives and insights. The interview is a highly effective procedure for obtaining information both from the person at the centre of the case and from friends, parents, employers, workmates, and others who know the person well.
  • Observations – Observing behaviors, interactions, processes, etc., related to the case as they unfold in real-time.
  • Documents & Records – Reviewing private documents, diaries, public records, correspondence, meeting minutes, etc., relevant to the case.

2. Secondary sources

  • News/Media – News coverage of events related to the case study.
  • Academic articles – Journal articles, dissertations etc. that discuss the case.
  • Government reports – Official data and records related to the case context.
  • Books/films – Books, documentaries or films discussing the case.

3. Archival records

Searching historical archives, museum collections, and databases can uncover relevant documents and visual or audio records related to the case history and its context.

Public archives like newspapers, organizational records, and photographic collections could all include potentially relevant pieces of information, shedding light on attitudes, cultural perspectives, common practices, and historical contexts related to psychology.

4. Organizational records

Organizational records offer the advantage of often having large datasets collected over time that can reveal or confirm psychological insights.

Of course, privacy and ethical concerns regarding confidential data must be navigated carefully.

However, with proper protocols, organizational records can provide invaluable context and empirical depth to qualitative case studies exploring the intersection of psychology and organizations.

  • Organizational/industrial psychology research: Organizational records like employee surveys, turnover/retention data, policies, incident reports etc. may provide insight into topics like job satisfaction, workplace culture and dynamics, leadership issues, employee behaviors etc.
  • Clinical psychology: Therapists/hospitals may grant access to anonymized medical records to study aspects like assessments, diagnoses, treatment plans etc. This could shed light on clinical practices.
  • School psychology: Studies could utilize anonymized student records like test scores, grades, disciplinary issues, and counseling referrals to study child development, learning barriers, effectiveness of support programs, and more.

How do I Write a Case Study in Psychology?

Follow the case study guidelines provided by a journal or your psychology tutor. General components of clinical case studies include: background, symptoms, assessments, diagnosis, treatment, and outcomes. Interpreting the information means the researcher decides what to include or leave out, so a good case study should always clarify which information is factual description and which is an inference or the researcher’s opinion.
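
One lightweight way to keep factual description separate from researcher inference while drafting is to tag each statement explicitly. The sketch below is a minimal Python illustration; the dataclass and example notes are hypothetical, not a prescribed reporting format.

```python
from dataclasses import dataclass

@dataclass
class CaseNote:
    section: str  # e.g., "Case Presentation" or "Discussion"
    text: str
    kind: str     # "fact" (observed/reported) or "inference" (researcher's interpretation)

notes = [
    CaseNote("Case Presentation", "Client reports sleep onset latency of ~2 hours.", "fact"),
    CaseNote("Discussion", "Sleep disturbance may be maintaining the low mood.", "inference"),
]

# Render a draft that flags inferences explicitly for the reader.
for n in notes:
    label = "" if n.kind == "fact" else " [researcher inference]"
    print(f"{n.section}: {n.text}{label}")
```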

1. Introduction

  • Provide background on the case context and why it is of interest, presenting background information like demographics, relevant history, and presenting problem.
  • Compare briefly to similar published cases if applicable. Clearly state the focus/importance of the case.

2. Case Presentation

  • Describe the presenting problem in detail, including symptoms, duration, and impact on daily life.
  • Include client demographics like age and gender, information about social relationships, and mental health history.
  • Describe all physical, emotional, and/or sensory symptoms reported by the client.
  • Use patient quotes to describe the initial complaint verbatim. Follow with full-sentence summaries of relevant history details gathered, including key components that led to a working diagnosis.
  • Summarize clinical exam results, namely orthopedic/neurological tests, imaging, lab tests, etc. Note actual results rather than subjective conclusions. Provide images if clearly reproducible/anonymized.
  • Clearly state the working diagnosis or clinical impression before transitioning to management.

3. Management and Outcome

  • Indicate the total duration of care and number of treatments given over what timeframe. Use specific names/descriptions for any therapies/interventions applied.
  • Present the results of the intervention, including any quantitative or qualitative data collected.
  • For outcomes, utilize visual analog scales for pain, medication usage logs, etc., if possible. Include patient self-reports of improvement/worsening of symptoms. Note the reason for discharge/end of care.

4. Discussion

  • Analyze the case, exploring contributing factors, limitations of the study, and connections to existing research.
  • Analyze the effectiveness of the intervention, considering factors like participant adherence, limitations of the study, and potential alternative explanations for the results.
  • Identify any questions raised in the case analysis and relate insights to established theories and current research if applicable. Avoid definitive claims about physiological explanations.
  • Offer clinical implications, and suggest future research directions.

5. Additional Items

  • Thank specific assistants for writing support only. No patient acknowledgments.
  • References should directly support any key claims or quotes included.
  • Use tables/figures/images only if substantially informative. Include permissions and legends/explanatory notes.

Strengths

  • Provides detailed (rich qualitative) information.
  • Provides insight for further research.
  • Permits investigation of otherwise impractical (or unethical) situations.

Case studies allow a researcher to investigate a topic in far more detail than might be possible if they were trying to deal with a large number of research participants (nomothetic approach) with the aim of ‘averaging’.

Because of their in-depth, multi-sided approach, case studies often shed light on aspects of human thinking and behavior that would be unethical or impractical to study in other ways.

Research that only looks into the measurable aspects of human behavior is not likely to give us insights into the subjective dimension of experience, which is important to psychoanalytic and humanistic psychologists.

Case studies are often used in exploratory research. They can help us generate new ideas (that might be tested by other methods). They are an important way of illustrating theories and can help show how different aspects of a person’s life are related to each other.

The method is, therefore, important for psychologists who adopt a holistic point of view (i.e., humanistic psychologists).

Limitations

  • Lacking scientific rigor and providing little basis for generalization of results to the wider population.
  • Researchers’ own subjective feelings may influence the case study (researcher bias).
  • Difficult to replicate.
  • Time-consuming and expensive.
  • The sheer volume of data, together with time restrictions, can limit the depth of analysis possible within the available resources.

Because a case study deals with only one person/event/group, we can never be sure if the case study investigated is representative of the wider body of “similar” instances. This means the conclusions drawn from a particular case may not be transferable to other settings.

Because case studies are based on the analysis of qualitative (i.e., descriptive) data, a lot depends on the psychologist’s interpretation of the information she has acquired.

This means that there is a lot of scope for observer bias, and it could be that the subjective opinions of the psychologist intrude in the assessment of what the data means.

For example, Freud has been criticized for producing case studies in which the information was sometimes distorted to fit particular behavioral theories (e.g., Little Hans).

This is also true of Money’s interpretation of the Bruce/Brenda case study (Diamond, 1997), in which he ignored evidence that went against his theory.

Breuer, J., & Freud, S. (1895). Studies on hysteria. Standard Edition 2: London.

Curtiss, S. (1981). Genie: The case of a modern wild child.

Diamond, M., & Sigmundson, K. (1997). Sex reassignment at birth: Long-term review and clinical implications. Archives of Pediatrics & Adolescent Medicine, 151(3), 298-304.

Freud, S. (1909a). Analysis of a phobia of a five-year-old boy. In The Pelican Freud Library (1977), Vol. 8, Case Histories 1, pp. 169-306.

Freud, S. (1909b). Bemerkungen über einen Fall von Zwangsneurose (Der “Rattenmann”). Jb. psychoanal. psychopathol. Forsch., I, pp. 357-421; GW, VII, pp. 379-463; Notes upon a case of obsessional neurosis, SE, 10: 151-318.

Harlow, J. M. (1848). Passage of an iron rod through the head. Boston Medical and Surgical Journal, 39, 389-393.

Harlow, J. M. (1868). Recovery from the passage of an iron bar through the head. Publications of the Massachusetts Medical Society, 2(3), 327-347.

Money, J., & Ehrhardt, A. A. (1972). Man & woman, boy & girl: The differentiation and dimorphism of gender identity from conception to maturity. Baltimore, MD: Johns Hopkins University Press.

Money, J., & Tucker, P. (1975). Sexual signatures: On being a man or a woman.

Further Information

  • Case Study Approach
  • Case Study Method
  • Enhancing the Quality of Case Studies in Health Services Research
  • “We do things together” A case study of “couplehood” in dementia
  • Using mixed methods for evaluating an integrative approach to cancer care: a case study



Data Collection – Methods, Types, and Examples


Definition:

Data collection is the process of gathering information from various sources in order to analyze it and make informed decisions. It can involve methods such as surveys, interviews, experiments, and observation.

In order for data collection to be effective, it is important to have a clear understanding of what data is needed and what the purpose of the data collection is. This can involve identifying the population or sample being studied, determining the variables to be measured, and selecting appropriate methods for collecting and recording data.

Types of Data Collection

Types of Data Collection are as follows:

Primary Data Collection

Primary data collection is the process of gathering original and firsthand information directly from the source or target population. This type of data collection involves collecting data that has not been previously gathered, recorded, or published. Primary data can be collected through various methods such as surveys, interviews, observations, experiments, and focus groups. The data collected is usually specific to the research question or objective and can provide valuable insights that cannot be obtained from secondary data sources. Primary data collection is often used in market research, social research, and scientific research.

Secondary Data Collection

Secondary data collection is the process of gathering information from existing sources that have already been collected and analyzed by someone else, rather than conducting new research to collect primary data. Secondary data can be collected from various sources, such as published reports, books, journals, newspapers, websites, government publications, and other documents.

Qualitative Data Collection

Qualitative data collection is used to gather non-numerical data such as opinions, experiences, perceptions, and feelings, through techniques such as interviews, focus groups, observations, and document analysis. It seeks to understand the deeper meaning and context of a phenomenon or situation and is often used in social sciences, psychology, and humanities. Qualitative data collection methods allow for a more in-depth and holistic exploration of research questions and can provide rich and nuanced insights into human behavior and experiences.

Quantitative Data Collection

Quantitative data collection is used to gather numerical data that can be analyzed using statistical methods. This data is typically collected through surveys, experiments, and other structured data collection methods. Quantitative data collection seeks to quantify and measure variables, such as behaviors, attitudes, and opinions, in a systematic and objective way. This data is often used to test hypotheses, identify patterns, and establish correlations between variables. Quantitative data collection methods allow for precise measurement and generalization of findings to a larger population. It is commonly used in fields such as economics, psychology, and the natural sciences.
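
As a toy illustration of establishing a correlation between two measured variables, the sketch below computes a Pearson coefficient over made-up paired measurements.

```python
import numpy as np

# Hypothetical paired measurements: weekly study hours and test scores.
hours = np.array([2, 4, 6, 8, 10])
scores = np.array([55, 60, 68, 74, 83])

r = np.corrcoef(hours, scores)[0, 1]  # Pearson correlation coefficient
print(f"Pearson r = {r:.2f}")  # values near 1.0 indicate a strong positive association
```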

Data Collection Methods

Data Collection Methods are as follows:

Surveys

Surveys involve asking questions to a sample of individuals or organizations to collect data. Surveys can be conducted in person, over the phone, or online.

Interviews

Interviews involve a one-on-one conversation between the interviewer and the respondent. Interviews can be structured or unstructured and can be conducted in person or over the phone.

Focus Groups

Focus groups are group discussions that are moderated by a facilitator. Focus groups are used to collect qualitative data on a specific topic.

Observation

Observation involves watching and recording the behavior of people, objects, or events in their natural setting. Observation can be done overtly or covertly, depending on the research question.

Experiments

Experiments involve manipulating one or more variables and observing the effect on another variable. Experiments are commonly used in scientific research.

Case Studies

Case studies involve in-depth analysis of a single individual, organization, or event. Case studies are used to gain detailed information about a specific phenomenon.

Secondary Data Analysis

Secondary data analysis involves using existing data that was collected for another purpose. Secondary data can come from various sources, such as government agencies, academic institutions, or private companies.

How to Collect Data

The following are some steps to consider when collecting data:

  • Define the objective: Before you start collecting data, define the objective of the study. This will help you determine what data you need to collect and how to collect it.
  • Identify the data sources: Identify the sources of data that will help you achieve your objective. These can be primary sources, such as surveys, interviews, and observations, or secondary sources, such as books, articles, and databases.
  • Determine the data collection method: Once you have identified the data sources, determine the data collection method. This could be through online surveys, phone interviews, or face-to-face meetings.
  • Develop a data collection plan: Develop a plan that outlines the steps you will take to collect the data. This plan should include the timeline, the tools and equipment needed, and the personnel involved.
  • Test the data collection process: Before you start collecting data, test the data collection process to ensure that it is effective and efficient.
  • Collect the data: Collect the data according to the plan you developed. Make sure you record the data accurately and consistently.
  • Analyze the data: Once you have collected the data, analyze it to draw conclusions and make recommendations.
  • Report the findings: Report the findings of your data analysis to the relevant stakeholders. This could be in the form of a report, a presentation, or a publication.
  • Monitor and evaluate the data collection process: After the data collection process is complete, monitor and evaluate it to identify areas for improvement in future data collection efforts.
  • Ensure data quality: Ensure that the collected data is of high quality and free from errors by validating it for accuracy, completeness, and consistency (a minimal validation sketch follows this list).
  • Maintain data security: Ensure that the collected data is secure and protected from unauthorized access or disclosure by implementing data security protocols and using secure storage and transmission methods.
  • Follow ethical considerations: Obtain informed consent from participants, protect their privacy and confidentiality, and ensure that the research does not cause harm to participants.
  • Use appropriate data analysis methods: Choose analysis methods based on the type of data collected and the research objectives. This could include statistical analysis, qualitative analysis, or a combination of both.
  • Record and store data properly: Record and store the collected data in a structured and organized format. This will make it easier to retrieve and use the data in future research or analysis.
  • Collaborate with other stakeholders: Collaborate with colleagues, experts, or community members to ensure that the data collected is relevant and useful for the intended purpose.
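
Here is that minimal validation sketch in Python. It assumes hypothetical survey records and made-up plausibility rules; a real project would tailor the checks to its own variables.

```python
import pandas as pd

# Hypothetical collected survey records.
df = pd.DataFrame({
    "respondent_id": [1, 2, 2, 4],
    "age": [34, -5, 29, 51],          # -5 is an entry error
    "satisfaction": [4, 5, None, 3],  # 1-5 scale; None is a missing response
})

# Check completeness, accuracy ranges, and consistency before analysis.
issues = {
    "missing_values": int(df.isna().sum().sum()),
    "duplicate_ids": int(df["respondent_id"].duplicated().sum()),
    "age_out_of_range": int((~df["age"].between(0, 120)).sum()),
    "satisfaction_out_of_scale": int((~df["satisfaction"].dropna().between(1, 5)).sum()),
}
print(issues)
```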

Applications of Data Collection

Data collection methods are widely used in different fields, including social sciences, healthcare, business, education, and more. Here are some examples of how data collection methods are used in different fields:

  • Social sciences: Social scientists often use surveys, questionnaires, and interviews to collect data from individuals or groups. They may also use observation to collect data on social behaviors and interactions. This data is often used to study topics such as human behavior, attitudes, and beliefs.
  • Healthcare: Data collection methods are used in healthcare to monitor patient health and track treatment outcomes. Electronic health records and medical charts are commonly used to collect data on patients’ medical history, diagnoses, and treatments. Researchers may also use clinical trials and surveys to collect data on the effectiveness of different treatments.
  • Business: Businesses use data collection methods to gather information on consumer behavior, market trends, and competitor activity. They may collect data through customer surveys, sales reports, and market research studies. This data is used to inform business decisions, develop marketing strategies, and improve products and services.
  • Education: In education, data collection methods are used to assess student performance and measure the effectiveness of teaching methods. Standardized tests, quizzes, and exams are commonly used to collect data on student learning outcomes. Teachers may also use classroom observation and student feedback to gather data on teaching effectiveness.
  • Agriculture: Farmers use data collection methods to monitor crop growth and health. Sensors and remote sensing technology can be used to collect data on soil moisture, temperature, and nutrient levels. This data is used to optimize crop yields and minimize waste.
  • Environmental sciences: Environmental scientists use data collection methods to monitor air and water quality, track climate patterns, and measure the impact of human activity on the environment. They may use sensors, satellite imagery, and laboratory analysis to collect data on environmental factors.
  • Transportation: Transportation companies use data collection methods to track vehicle performance, optimize routes, and improve safety. GPS systems, on-board sensors, and other tracking technologies are used to collect data on vehicle speed, fuel consumption, and driver behavior.

Examples of Data Collection

Examples of Data Collection are as follows:

  • Traffic Monitoring: Cities collect real-time data on traffic patterns and congestion through sensors on roads and cameras at intersections. This information can be used to optimize traffic flow and improve safety.
  • Social Media Monitoring: Companies can collect real-time data on social media platforms such as Twitter and Facebook to monitor their brand reputation, track customer sentiment, and respond to customer inquiries and complaints in real-time.
  • Weather Monitoring: Weather agencies collect real-time data on temperature, humidity, air pressure, and precipitation through weather stations and satellites. This information is used to provide accurate weather forecasts and warnings.
  • Stock Market Monitoring: Financial institutions collect real-time data on stock prices, trading volumes, and other market indicators to make informed investment decisions and respond to market fluctuations in real-time.
  • Health Monitoring: Medical devices such as wearable fitness trackers and smartwatches can collect real-time data on a person’s heart rate, blood pressure, and other vital signs. This information can be used to monitor health conditions and detect early warning signs of health issues.

Purpose of Data Collection

The purpose of data collection can vary depending on the context and goals of the study, but generally, it serves to:

  • Provide information: Data collection provides information about a particular phenomenon or behavior that can be used to better understand it.
  • Measure progress: Data collection can be used to measure the effectiveness of interventions or programs designed to address a particular issue or problem.
  • Support decision-making: Data collection provides decision-makers with evidence-based information that can be used to inform policies, strategies, and actions.
  • Identify trends: Data collection can help identify trends and patterns over time that may indicate changes in behaviors or outcomes.
  • Monitor and evaluate: Data collection can be used to monitor and evaluate the implementation and impact of policies, programs, and initiatives.

When to use Data Collection

Data collection is used when there is a need to gather information or data on a specific topic or phenomenon. It is typically used in research, evaluation, and monitoring and is important for making informed decisions and improving outcomes.

Data collection is particularly useful in the following scenarios:

  • Research: When conducting research, data collection is used to gather information on variables of interest to answer research questions and test hypotheses.
  • Evaluation: Data collection is used in program evaluation to assess the effectiveness of programs or interventions, and to identify areas for improvement.
  • Monitoring: Data collection is used in monitoring to track progress towards achieving goals or targets, and to identify any areas that require attention.
  • Decision-making: Data collection is used to provide decision-makers with information that can be used to inform policies, strategies, and actions.
  • Quality improvement: Data collection is used in quality improvement efforts to identify areas where improvements can be made and to measure progress towards achieving goals.

Characteristics of Data Collection

Data collection can be characterized by several important characteristics that help to ensure the quality and accuracy of the data gathered. These characteristics include:

  • Validity: Validity refers to the accuracy and relevance of the data collected in relation to the research question or objective.
  • Reliability: Reliability refers to the consistency and stability of the data collection process, ensuring that the results obtained are consistent over time and across different contexts.
  • Objectivity: Objectivity refers to the impartiality of the data collection process, ensuring that the data collected is not influenced by the biases or personal opinions of the data collector.
  • Precision: Precision refers to the degree of accuracy and detail in the data collected, ensuring that the data is specific and accurate enough to answer the research question or objective.
  • Timeliness: Timeliness refers to the efficiency and speed with which the data is collected, ensuring that the data is collected in a timely manner to meet the needs of the research or evaluation.
  • Ethical considerations: Ethical considerations refer to the ethical principles that must be followed when collecting data, such as ensuring confidentiality and obtaining informed consent from participants.

Advantages of Data Collection

There are several advantages of data collection that make it an important process in research, evaluation, and monitoring. These advantages include:

  • Better decision-making: Data collection provides decision-makers with evidence-based information that can be used to inform policies, strategies, and actions, leading to better decision-making.
  • Improved understanding: Data collection helps to improve our understanding of a particular phenomenon or behavior by providing empirical evidence that can be analyzed and interpreted.
  • Evaluation of interventions: Data collection is essential in evaluating the effectiveness of interventions or programs designed to address a particular issue or problem.
  • Identifying trends and patterns: Data collection can help identify trends and patterns over time that may indicate changes in behaviors or outcomes.
  • Increased accountability: Data collection increases accountability by providing evidence that can be used to monitor and evaluate the implementation and impact of policies, programs, and initiatives.
  • Validation of theories: Data collection can be used to test hypotheses and validate theories, leading to a better understanding of the phenomenon being studied.
  • Improved quality: Data collection is used in quality improvement efforts to identify areas where improvements can be made and to measure progress towards achieving goals.

Limitations of Data Collection

While data collection has several advantages, it also has some limitations that must be considered. These limitations include:

  • Bias: Data collection can be influenced by the biases and personal opinions of the data collector, which can lead to inaccurate or misleading results.
  • Sampling bias: Data collection may not be representative of the entire population, resulting in sampling bias and inaccurate results.
  • Cost: Data collection can be expensive and time-consuming, particularly for large-scale studies.
  • Limited scope: Data collection is limited to the variables being measured, which may not capture the entire picture or context of the phenomenon being studied.
  • Ethical considerations: Data collection must follow ethical principles to protect the rights and confidentiality of the participants, which can limit the type of data that can be collected.
  • Data quality issues: Data collection may result in data quality issues such as missing or incomplete data, measurement errors, and inconsistencies.
  • Limited generalizability: Data collection may not be generalizable to other contexts or populations, limiting the generalizability of the findings.


The Case Study: Methods of Data Collection

Farideh Delavari Edalat and M. Reza Abdi

Part of the book series: International Series in Operations Research & Management Science (ISOR, volume 258)

This chapter concerns the methodology choices that shaped the process and outcomes of this book. It identifies a case study, built on data collected through semi-structured interviews, to establish the knowledge required for the conceptual framework of AWM (Adaptive Water Management).



About this chapter

Edalat, F.D., Abdi, M.R. (2018). The Case Study: Methods of Data Collection. In: Adaptive Water Management. International Series in Operations Research & Management Science, vol 258. Springer, Cham. https://doi.org/10.1007/978-3-319-64143-0_6



Case Study Method – 18 Advantages and Disadvantages

The case study method uses investigatory research as a way to collect data about specific demographics. This approach can apply to individuals, businesses, groups, or events. Each participant receives an equal amount of attention, offering information that can yield new insights into specific trends, ideas, or hypotheses.

Interviews and research observation are the two standard methods of data collection used when following the case study method.

Researchers initially developed the case study method to develop and support hypotheses in clinical medicine. The benefits found in these efforts led the approach to transition to other industries, allowing for the examination of results through proposed decisions, processes, or outcomes. Its unique approach to information makes it possible for others to glean specific points of wisdom that encourage growth.

Several case study method advantages and disadvantages can appear when researchers take this approach.

List of the Advantages of the Case Study Method

1. It requires an intensive study of a specific unit. Researchers must document verifiable data from direct observations when using the case study method. This work offers information about the input processes that go into the hypothesis under consideration. A casual approach to data-gathering work is not effective if a definitive outcome is desired. Each behavior, choice, or comment is a critical component that can verify or dispute the ideas being considered.

Intensive programs can require a significant amount of work for researchers, but it can also promote an improvement in the data collected. That means a hypothesis can receive immediate verification in some situations.

2. No sampling is required when following the case study method. This research method studies social units in their entire perspective instead of pulling individual data points out to analyze them. That means there is no sampling work required when using the case study method. The hypothesis under consideration receives support because it works to turn opinions into facts, verifying or denying the proposals that outside observers can use in the future.

Although researchers might pay attention to specific incidents or outcomes based on generalized behaviors or ideas, the study itself won’t sample those situations. It takes a look at the “bigger vision” instead.

3. This method offers a continuous analysis of the facts. The case study method will look at the facts continuously for the social group being studied by researchers. That means there aren’t interruptions in the process that could limit the validity of the data being collected through this work. This advantage reduces the need to use assumptions when drawing conclusions from the information, adding validity to the outcome of the study over time. That means the outcome becomes relevant to both sides of the equation as it can prove specific suppositions or invalidate a hypothesis under consideration.

This advantage can lead to inefficiencies because of the amount of data being studied by researchers. It is up to the individuals involved in the process to sort out what is useful and meaningful and what is not.

4. It is a useful approach to take when formulating a hypothesis. Researchers will use the case study method advantages to verify a hypothesis under consideration. It is not unusual for the collected data to lead people toward the formulation of new ideas after completing this work. This process encourages further study because it allows concepts to evolve as people do in social or physical environments. That means a complete data set can be gathered based on the skills of the researcher and the honesty of the individuals involved in the study itself.

Although this approach won’t develop a societal-level evaluation of a hypothesis, it can look at how specific groups will react in various circumstances. That information can lead to a better decision-making process in the future for everyone involved.

5. It provides an increase in knowledge. The case study method provides everyone with analytical power to increase knowledge. This advantage is possible because it uses a variety of methodologies to collect information while evaluating a hypothesis. Researchers prefer to use direct observation and interviews to complete their work, but questionnaires can provide an additional avenue for data collection. Participants might need to fill out a journal or diary about their experiences that can be used to study behaviors or choices.

Some researchers incorporate memory tests and experimental tasks to determine how social groups will interact or respond in specific situations. All of this data then works to verify the possibilities that a hypothesis proposes.

6. The case study method allows for comparisons. The human experience is one that is built on individual observations from group situations. Specific demographics might think, act, or respond in particular ways to stimuli, but each person in that group will also contribute a small part to the whole. You could say that people are sponges that collect data from one another every day to create individual outcomes.

The case study method allows researchers to take the information from each demographic for comparison purposes. This information can then lead to proposals that support a hypothesis or lead to its disruption.

7. Data generalization is possible using the case study method. The case study method provides a foundation for data generalization, allowing researchers to illustrate their statistical findings in meaningful ways. It puts the information into a usable format that almost anyone can use if they have the need to evaluate the hypothesis under consideration. This process makes it easier to discover unusual features, unique outcomes, or conclusions that wouldn’t be available without this method. It does an excellent job of identifying specific concepts that relate to the proposed ideas that researchers were verifying through their work.

Generalization does not apply to a larger population group with the case study method. What researchers can do with this information is to suggest a predictable outcome when similar groups are placed in an equal situation.

8. It offers a comprehensive approach to research. Nothing gets ignored when using the case study method to collect information. Every person, place, or thing involved in the research receives the complete attention of those seeking data. The interactions are equal, which means the data is comprehensive and directly reflective of the group being observed.

This advantage means that there are fewer outliers to worry about when researching an idea, leading to a higher level of accuracy in the conclusions drawn by the researchers.

9. The identification of deviant cases is possible with this method. The case study method of research makes it easier to identify deviant cases that occur in each social group. These incidents are units (people) that behave in ways that go against the hypothesis under consideration. Instead of ignoring them like other options do when collecting data, this approach incorporates the “rogue” behavior to understand why it exists in the first place.

This advantage makes the eventual data and conclusions gathered more reliable because it incorporates the “alternative opinion” that exists. One might say that the case study method places as much emphasis on the yin as it does the yang so that the whole picture becomes available to the outside observer.

10. Questionnaire development is possible with the case study method. Interviews and direct observation are the preferred methods of implementing the case study method because they are inexpensive and can be done remotely. The information gathered by researchers can also inform questionnaires that gather additional data from those being studied. When all of the data resources come together, it is easier to formulate a conclusion that accurately reflects the demographics.

Some people in the case study method may try to manipulate the results for personal reasons, but this advantage makes it possible to identify this information readily. Then researchers can look into the thinking that goes into the dishonest behaviors observed.

List of the Disadvantages of the Case Study Method

1. The case study method offers limited representation. The usefulness of the case study method is limited to a specific group of representatives. Researchers are looking at a specific demographic when using this option. That means it is impossible to create any generalization that applies to the rest of society, an organization, or a larger community with this work. The findings can only apply to other groups caught in similar circumstances with the same experiences.

It is useful to use the case study method when attempting to discover the specific reasons why some people behave in a specific way. If researchers need something more generalized, then a different method must be used.

2. No classification is possible with the case study method. This disadvantage is also due to the sample size in the case study method. No classification is possible because researchers are studying such a small unit, group, or demographic. It can be an inefficient process since the skills of the researcher help to determine the quality of the data being collected to verify the validity of a hypothesis. Some participants may be unwilling to answer or participate, while others might try to guess at the outcome to support it.

Researchers can get trapped in a place where they explore more tangents than the actual hypothesis with this option. Classification can occur within the units being studied, but this data cannot extrapolate to other demographics.

3. The case study method still offers the possibility of errors. Each person has an unconscious bias that influences their behaviors and choices. The case study method can find outliers that oppose a hypothesis fairly easily thanks to its emphasis on finding facts, but it is up to the researchers to determine what information qualifies for this designation. If the results from the case study method are surprising or go against the opinion of participating individuals, then there is still the possibility that the information will not be 100% accurate.

Researchers must have controls in place that dictate how data gathering work occurs. Without this limitation in place, the results of the study cannot be guaranteed because of the presence of bias.

4. It is a subjective method to use for research. Although the purpose of the case study method of research is to gather facts, the foundation of what gets gathered is still based on opinion. It uses the subjective method instead of the objective one when evaluating data, which means there can be another layer of errors in the information to consider.

Imagine that a researcher interprets someone’s response as “angry” when performing direct observation, but the individual was feeling “shame” because of a decision they made. The difference between those two emotions is profound, and it could lead to information disruptions that could be problematic to the eventual work of hypothesis verification.

5. The processes required by the case study method are not useful for everyone. The case study method uses a person’s memories, explanations, and records from photographs and diaries to identify interactions and influences on psychological processes. People are given the chance to describe what happens in the world around them as a way for researchers to gather data. This process can be an advantage in some industries, but it can also be a worthless approach for some groups.

If the social group under study doesn’t have the information, knowledge, or wisdom to provide meaningful data, then the processes are no longer useful. Researchers must weigh the advantages and disadvantages of the case study method before starting their work to determine if the possibility of value exists. If it does not, then a different method may be necessary.

6. It is possible for bias to form in the data. It’s not just an unconscious bias that can form in the data when using the case study method. The narrow study approach can lead to outright discrimination in the data. Researchers can decide to ignore outliers or any other information that doesn’t support their hypothesis when using this method. The subjective nature of this approach makes it difficult to challenge the conclusions that get drawn from this work, and the limited pool of units (people) means that duplication is almost impossible.

That means unethical people can manipulate the results gathered by the case study method to their own advantage without much accountability in the process.

7. This method has no fixed limits to it. This method of research is highly dependent on situational circumstances rather than overarching societal or corporate truths. That means the researcher has no fixed limits of investigation. Even when controls are in place to limit bias or recommend specific activities, the case study method has enough flexibility built into its structures to allow for additional exploration. That means it is possible for this work to continue indefinitely, gathering data that never becomes useful.

Scientists began to track the health of 268 sophomores at Harvard in 1938. The Great Depression was in its final years at that point, so the study hoped to reveal clues that lead to happy and healthy lives. It continues still today, now incorporating the children of the original participants, providing over 80 years of information to sort through for conclusions.

8. The case study method is time-consuming and expensive. The case study method can be affordable in some situations, but the lack of fixed limits and the ability to pursue tangents can make it a costly process in most situations. It takes time to gather the data in the first place, and then researchers must interpret the information received so that they can use it for hypothesis evaluation. There are other methods of data collection that can be less expensive and provide results faster.

That doesn’t mean the case study method is useless. The individualization of results can help the decision-making process advance in a variety of industries successfully. It just takes more time to reach the appropriate conclusion, and that might be a resource that isn’t available.

The advantages and disadvantages of the case study method suggest that the helpfulness of this research option depends on the specific hypothesis under consideration. When researchers have the correct skills and mindset to gather data accurately, then it can lead to supportive data that can verify ideas with tremendous accuracy.

This research method can also be used unethically to produce specific results that can be difficult to challenge.

When bias enters into the structure of the case study method, the processes become inefficient, inaccurate, and harmful to the hypothesis. That’s why great care must be taken when designing a study with this approach. It might be a labor-intensive way to develop conclusions, but the outcomes are often worth the investments needed.


Data Analytics Case Study Guide 2024



Data analytics case studies reveal how businesses harness data for informed decisions and growth.

For aspiring data professionals, mastering the case study process will enhance your skills and increase your career prospects.


So, how do you approach a case study?

Use these steps to process a data analytics case study:

Understand the Problem: Grasp the core problem or question addressed in the case study.

Collect Relevant Data: Gather data from diverse sources, ensuring accuracy and completeness.

Apply Analytical Techniques: Use appropriate methods aligned with the problem statement.

Visualize Insights: Utilize visual aids to showcase patterns and key findings.

Derive Actionable Insights: Focus on deriving meaningful actions from the analysis.
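
A compressed walk-through of those five steps in Python; the regions, revenue figures, and business question are invented for illustration, and step 4 is printed rather than plotted for brevity.

```python
import pandas as pd

# 1. Understand the problem: which region is driving the revenue decline?
# 2. Collect relevant data (here, a small hypothetical sales extract).
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South", "West", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q2", "Q1", "Q2"],
    "revenue": [120, 95, 80, 82, 60, 41],
})

# 3. Apply analytical techniques: quarter-over-quarter change per region.
pivot = sales.pivot(index="region", columns="quarter", values="revenue")
pivot["qoq_change"] = pivot["Q2"] - pivot["Q1"]

# 4. Visualize insights (a bar chart would work; printed here for brevity).
print(pivot.sort_values("qoq_change"))

# 5. Derive an actionable insight: the steepest decliner deserves investigation first.
worst = pivot["qoq_change"].idxmin()
print(f"Investigate {worst} first: largest quarter-over-quarter drop")
```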

This article will give you detailed steps to navigate a case study effectively and understand how it works in real-world situations.

By the end of the article, you will be better equipped to approach a data analytics case study, strengthening your analytical prowess and practical application skills.

Let’s dive in!


What is a Data Analytics Case Study?

A data analytics case study is a real or hypothetical scenario where analytics techniques are applied to solve a specific problem or explore a particular question.

It’s a practical approach that uses data analytics methods, assisting in deciphering data for meaningful insights. This structured method helps individuals or organizations make sense of data effectively.

Additionally, it’s a way to learn by doing, where there’s no single right or wrong answer in how you analyze the data.

So, what are the components of a case study?

Key Components of a Data Analytics Case Study

A data analytics case study comprises essential elements that structure the analytical journey:

Problem Context: A case study begins with a defined problem or question. It provides the context for the data analysis, setting the stage for exploration and investigation.

Data Collection and Sources: It involves gathering relevant data from various sources, ensuring data accuracy, completeness, and relevance to the problem at hand.

Analysis Techniques: Case studies employ different analytical methods, such as statistical analysis, machine learning algorithms, or visualization tools, to derive meaningful conclusions from the collected data.

Insights and Recommendations: The ultimate goal is to extract actionable insights from the analyzed data, offering recommendations or solutions that address the initial problem or question.

Now that you have a better understanding of what a data analytics case study is, let’s talk about why we need and use them.

Why Case Studies are Integral to Data Analytics

Case studies serve as invaluable tools in the realm of data analytics, offering multifaceted benefits that bolster an analyst’s proficiency and impact:

Real-Life Insights and Skill Enhancement: Examining case studies provides practical, real-life examples that expand knowledge and refine skills. These examples offer insights into diverse scenarios, aiding in a data analyst’s growth and expertise development.

Validation and Refinement of Analyses: Case studies demonstrate the effectiveness of data-driven decisions across industries, providing validation for analytical approaches. They showcase how organizations benefit from data analytics and help in refining one’s own methodologies.

Showcasing Data Impact on Business Outcomes: These studies show how data analytics directly affects business results, like increasing revenue, reducing costs, or delivering other measurable advantages. Understanding these impacts helps articulate the value of data analytics to stakeholders and decision-makers.

Learning from Successes and Failures: By exploring a case study, analysts glean insights from others’ successes and failures, acquiring new strategies and best practices. This learning experience facilitates professional growth and the adoption of innovative approaches within their own data analytics work.

Including case studies in a data analyst’s toolkit helps gain more knowledge, improve skills, and understand how data analytics affects different industries.

Using these real-life examples boosts confidence and success, guiding analysts to make better and more impactful decisions in their organizations.

But not all case studies are the same.

Let’s talk about the different types.

Types of Data Analytics Case Studies

Data analytics encompasses various approaches tailored to different analytical goals:

Exploratory Case Study: These involve delving into new datasets to uncover hidden patterns and relationships, often without a predefined hypothesis. They aim to gain insights and generate hypotheses for further investigation.

Predictive Case Study: These utilize historical data to forecast future trends, behaviors, or outcomes. By applying predictive models, they help anticipate potential scenarios or developments.

Diagnostic Case Study: This type focuses on understanding the root causes or reasons behind specific events or trends observed in the data. It digs deep into the data to provide explanations for occurrences.

Prescriptive Case Study: This case study goes beyond analytics; it provides actionable recommendations or strategies derived from the analyzed data. They guide decision-making processes by suggesting optimal courses of action based on insights gained.

Each type has a specific role in using data to find important insights, helping in decision-making, and solving problems in various situations.
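
As a small illustration of the predictive type, the sketch below fits a linear model to invented historical data and forecasts a future scenario; the ad-spend numbers and the scikit-learn model choice are assumptions for demonstration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical history: monthly ad spend (in $k) vs. units sold.
ad_spend = np.array([[10.0], [12.0], [15.0], [18.0], [22.0], [25.0]])
units_sold = np.array([110, 130, 158, 180, 215, 240])

# A predictive case study fits a model to historical data...
model = LinearRegression().fit(ad_spend, units_sold)

# ...then uses it to anticipate a future scenario.
print("Forecast units sold:", model.predict(np.array([[30.0]]))[0])
print("R^2 on history:", model.score(ad_spend, units_sold))
```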

Regardless of the type of case study you encounter, here are some steps to help you process them.

Roadmap to Handling a Data Analysis Case Study

Embarking on a data analytics case study requires a systematic, step-by-step approach to derive valuable insights effectively.

Here are the steps to help you through the process:

Step 1: Understanding the Case Study Context: Immerse yourself in the intricacies of the case study. Delve into the industry context, understanding its nuances, challenges, and opportunities.

Identify the central problem or question the study aims to address. Clarify the objectives and expected outcomes, ensuring a clear understanding before diving into data analytics.

Step 2: Data Collection and Validation: Gather data from diverse sources relevant to the case study. Prioritize accuracy, completeness, and reliability during data collection. Conduct thorough validation processes to rectify inconsistencies, ensuring high-quality and trustworthy data for subsequent analysis.
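
A few lines of pandas can automate the validation this step describes; the columns, rules, and inline data below are illustrative assumptions, not a prescribed checklist.

```python
import pandas as pd

# Hypothetical collected responses; in practice these come from files or APIs.
df = pd.DataFrame({
    "respondent_id": [1, 2, 3, 4],
    "age": [34, 51, None, 29],
    "region": [" North", "south ", "SOUTH", "north"],
})

# Completeness: report the share of missing values per column.
print(df.isna().mean())

# Accuracy: enforce simple domain rules before any analysis.
assert df["respondent_id"].is_unique, "duplicate respondent IDs"
assert df["age"].dropna().between(0, 120).all(), "age outside plausible range"

# Consistency: normalize labels gathered from different sources.
df["region"] = df["region"].str.strip().str.lower()
print(df["region"].unique())
```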

Step 3: Problem Definition and Scope: Define the problem statement precisely. Articulate the objectives and limitations that shape the scope of your analysis. Identify influential variables and constraints, providing a focused framework to guide your exploration.

Step 4: Exploratory Data Analysis (EDA): Leverage exploratory techniques to gain initial insights. Visualize data distributions, patterns, and correlations, fostering a deeper understanding of the dataset. These explorations serve as a foundation for more nuanced analysis.
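
A minimal EDA pass might look like the sketch below; the synthetic columns and the histograms are assumptions chosen only to illustrate the step.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the case study dataset.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "ad_spend": rng.normal(100, 20, 200),
    "visits": rng.normal(1000, 150, 200),
})
df["revenue"] = 5 * df["ad_spend"] + 0.5 * df["visits"] + rng.normal(0, 30, 200)

# Summary statistics give a first look at each column.
print(df.describe())

# Pairwise correlations surface relationships worth deeper analysis.
print(df.corr())

# Histograms visualize the distributions (requires matplotlib).
df.hist(bins=20)
```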

Step 5: Data Preprocessing and Transformation: Cleanse and preprocess the data to eliminate noise, handle missing values, and ensure consistency. Transform data formats or scales as required, preparing the dataset for further analysis.
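
In pandas, the cleansing and transformation described here might look like this sketch; the column names, the positivity rule, and median imputation are illustrative choices, not requirements.

```python
import numpy as np
import pandas as pd

# Stand-in for raw collected data.
df = pd.DataFrame({
    "measurement": [3.1, 3.1, np.nan, 2.8, -1.0, 3.4],
    "sensor": ["a", "a", "b", "b", "b", "a"],
})

# Eliminate noise: drop exact duplicates and physically impossible readings.
df = df.drop_duplicates()
df = df[df["measurement"].isna() | (df["measurement"] > 0)]

# Handle missing values: impute numeric gaps with the median.
df["measurement"] = df["measurement"].fillna(df["measurement"].median())

# Transform scales: standardize for scale-sensitive models downstream.
df["measurement"] = (df["measurement"] - df["measurement"].mean()) / df["measurement"].std()
print(df)
```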

Step 6: Data Modeling and Method Selection: Select analytical models aligning with the case study’s problem, employing statistical techniques, machine learning algorithms, or tailored predictive models.

In this phase, it’s important to develop data modeling skills. This helps create visuals of complex systems using organized data, which helps solve business problems more effectively.

Understand key data modeling concepts, utilize essential tools like SQL for database interaction, and practice building models from real-world scenarios.

Furthermore, strengthen data cleaning skills for accurate datasets, and stay updated with industry trends to ensure relevance.
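
One way this model selection step can look in code is sketched below with scikit-learn; the synthetic dataset and the random forest choice are assumptions standing in for a real business problem.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in; in a real case study X and y come from the
# cleaned dataset produced in the preprocessing step.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Select a model aligned with the problem statement -- here a
# classifier for a hypothetical yes/no business outcome.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print("Holdout accuracy:", model.score(X_test, y_test))
```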

Step 7: Model Evaluation and Refinement: Evaluate the performance of applied models rigorously. Iterate and refine models to enhance accuracy and reliability, ensuring alignment with the objectives and expected outcomes.
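
Evaluation and refinement can be made concrete with cross-validation and a small hyperparameter search, as in this sketch (again on synthetic data; the grid values are arbitrary examples).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score

X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# Evaluate rigorously: k-fold cross-validation is more stable than
# a single train/test split.
base = RandomForestClassifier(random_state=0)
print("CV accuracy:", cross_val_score(base, X, y, cv=5).mean())

# Iterate and refine: search a small grid of key hyperparameters.
grid = {"n_estimators": [100, 300], "max_depth": [None, 10]}
search = GridSearchCV(base, grid, cv=5)
search.fit(X, y)
print("Best params:", search.best_params_, "score:", search.best_score_)
```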

Step 8: Deriving Insights and Recommendations: Extract actionable insights from the analyzed data. Develop well-structured recommendations or solutions based on the insights uncovered, addressing the core problem or question effectively.

Step 9: Communicating Results Effectively: Present findings, insights, and recommendations clearly and concisely. Utilize visualizations and storytelling techniques to convey complex information compellingly, ensuring comprehension by stakeholders.

Step 10: Reflection and Iteration: Reflect on the entire analysis process and outcomes. Identify potential improvements and lessons learned. Embrace an iterative approach, refining methodologies for continuous enhancement and future analyses.

This step-by-step roadmap provides a structured framework for thorough and effective handling of a data analytics case study.

Now, after handling the data analytics comes a crucial step: presenting the case study.

Presenting Your Data Analytics Case Study

Presenting a data analytics case study is a vital part of the process. When presenting your case study, clarity and organization are paramount.

To achieve this, follow these key steps:

Structuring Your Case Study: Start by outlining relevant and accurate main points. Ensure these points align with the problem addressed and the methodologies used in your analysis.

Crafting a Narrative with Data: Start with a brief overview of the issue, then explain your method and steps, covering data collection, cleaning, stats, and advanced modeling.

Visual Representation for Clarity: Utilize various visual aids—tables, graphs, and charts—to illustrate patterns, trends, and insights. Ensure these visuals are easy to comprehend and seamlessly support your narrative.

Highlighting Key Information: Use bullet points to emphasize essential information, maintaining clarity and allowing the audience to grasp key takeaways effortlessly. Bold key terms or phrases to draw attention and reinforce important points.

Addressing Audience Queries: Anticipate and be ready to answer audience questions regarding methods, assumptions, and results. Demonstrating a profound understanding of your analysis instills confidence in your work.

Integrity and Confidence in Delivery: Maintain a neutral tone and avoid exaggerated claims about findings. Present your case study with integrity, clarity, and confidence to ensure the audience appreciates and comprehends the significance of your work.

By organizing your presentation well, telling a clear story through your analysis, and using visuals wisely, you can effectively share your data analytics case study.

This method helps people understand better, stay engaged, and draw valuable conclusions from your work.

We hope that by now you feel confident processing a case study. But with any process, there are challenges you may encounter.

Key Challenges in Data Analytics Case Studies

A data analytics case study can present various hurdles that necessitate strategic approaches for successful navigation:

Challenge 1: Data Quality and Consistency

Challenge: Inconsistent or poor-quality data can impede analysis, leading to erroneous insights and flawed conclusions.

Solution: Implement rigorous data validation processes, ensuring accuracy, completeness, and reliability. Employ data cleansing techniques to rectify inconsistencies and enhance overall data quality.

Challenge 2: Complexity and Scale of Data

Challenge: Managing vast volumes of data with diverse formats and complexities poses analytical challenges.

Solution: Utilize scalable data processing frameworks and tools capable of handling diverse data types. Implement efficient data storage and retrieval systems to manage large-scale datasets effectively.

Challenge 3: Interpretation and Contextual Understanding

Challenge: Interpreting data without contextual understanding or domain expertise can lead to misinterpretations.

Solution: Collaborate with domain experts to contextualize data and derive relevant insights. Invest in understanding the nuances of the industry or domain under analysis to ensure accurate interpretations.

Challenge 4: Privacy and Ethical Concerns

Challenge: Balancing data access for analysis while respecting privacy and ethical boundaries poses a challenge.

Solution: Implement robust data governance frameworks that prioritize data privacy and ethical considerations. Ensure compliance with regulatory standards and ethical guidelines throughout the analysis process.

Challenge 5: Resource Limitations and Time Constraints

Challenge: Limited resources and time constraints hinder comprehensive analysis and exhaustive data exploration.

Solution: Prioritize key objectives and allocate resources efficiently. Employ agile methodologies to iteratively analyze and derive insights, focusing on the most impactful aspects within the given timeframe.

Recognizing these challenges is key; it helps data analysts adopt proactive strategies to mitigate obstacles. This enhances the effectiveness and reliability of insights derived from a data analytics case study.

Now, let’s talk about the best software tools you should use when working with case studies.

Top 5 Software Tools for Case Studies

In the realm of case studies within data analytics, leveraging the right software tools is essential.

Here are some top-notch options:

Tableau: Renowned for its data visualization prowess, Tableau transforms raw data into interactive, visually compelling representations, ideal for presenting insights within a case study.

Python and R Libraries: These flexible programming languages provide many tools for handling data, doing statistics, and working with machine learning, meeting various needs in case studies.

Microsoft Excel: A staple tool for data analytics, Excel provides a user-friendly interface for basic analytics, making it useful for initial data exploration in a case study.

SQL Databases: Structured Query Language (SQL) databases assist in managing and querying large datasets, essential for organizing case study data effectively.

Statistical Software (e.g., SPSS, SAS): Specialized statistical software enables in-depth statistical analysis, aiding in deriving precise insights from case study data.

Choosing the best mix of these tools, tailored to each case study’s needs, greatly boosts analytical abilities and results in data analytics.

Final Thoughts

Case studies in data analytics are helpful guides. They give real-world insights, improve skills, and show how data-driven decisions work.

Using case studies helps analysts learn, be creative, and make essential decisions confidently in their data work.

Frequently Asked Questions

What are the key steps to analyzing a data analytics case study?

When analyzing a case study, you should follow these steps:

Clarify the problem: Ensure you thoroughly understand the problem statement and the scope of the analysis.

Make assumptions: Define your assumptions to establish a feasible framework for analyzing the case.

Gather context: Acquire relevant information and context to support your analysis.

Analyze the data: Perform calculations, create visualizations, and conduct statistical analysis on the data.

Provide insights: Draw conclusions and develop actionable insights based on your analysis.

How can you effectively interpret results during a data scientist case study job interview?

During your next data science interview, interpret case study results succinctly and clearly. Utilize visual aids and numerical data to bolster your explanations, ensuring comprehension.

Frame the results in an audience-friendly manner, emphasizing relevance. Concentrate on deriving insights and actionable steps from the outcomes.

How do you showcase your data analyst skills in a project?

To demonstrate your skills effectively, consider these essential steps. Begin by selecting a problem that allows you to exhibit your capacity to handle real-world challenges through analysis.

Methodically document each phase, encompassing data cleaning, visualization, statistical analysis, and the interpretation of findings.

Utilize descriptive analysis techniques and effectively communicate your insights using clear visual aids and straightforward language. Ensure your project code is well-structured, with detailed comments and documentation, showcasing your proficiency in handling data in an organized manner.

Lastly, emphasize your expertise in SQL queries, programming languages, and various analytics tools throughout the project. These steps collectively highlight your competence and proficiency as a skilled data analyst, demonstrating your capabilities within the project.

Can you provide an example of a successful data analytics project using key metrics?

A prime illustration is utilizing analytics in healthcare to forecast hospital readmissions. Analysts leverage electronic health records, patient demographics, and clinical data to identify high-risk individuals.

Implementing preventive measures based on these key metrics helps curtail readmission rates, enhancing patient outcomes and cutting healthcare expenses.

This demonstrates how data analytics, driven by metrics, effectively tackles real-world challenges, yielding impactful solutions.

Why would a company invest in data analytics?

Companies invest in data analytics to gain valuable insights, enabling informed decision-making and strategic planning. This investment helps optimize operations, understand customer behavior, and stay competitive in their industry.

Ultimately, leveraging data analytics empowers companies to make smarter, data-driven choices, leading to enhanced efficiency, innovation, and growth.


Q&A: How – and why – we’re changing the way we study tech adoption

What share of U.S. adults have high-speed internet at home? Own a smartphone? Use social media?

Pew Research Center has long studied tech adoption by interviewing Americans over the phone. But starting with the publications released today, we’ll be reporting on these topics using our National Public Opinion Reference Survey (NPORS) instead. The biggest difference: NPORS participants are invited by postal mail and can respond to the survey via a paper questionnaire or online, rather than by phone.

To explain the thinking behind this change and its implications for our future work, here’s a conversation with Managing Director of Internet and Technology Research Monica Anderson and Research Associate Colleen McClain. This interview has been condensed and edited for clarity.

Pew Research Center has been tracking tech adoption in the United States for decades. Why is this area of study so important?

Anderson: We see this research as foundational to understanding the broader impact that the internet, mobile technology and social media have on our society.

Americans have an array of digital tools that help them with everything from getting news to shopping to finding jobs. Studying how people are going online, which devices they own and which social media sites they use is crucial for understanding how they experience the world around them.

This research also anchors our ongoing work on the digital divide: the gap between those who have access to certain technologies and those who don’t. It shows us where demographic differences exist, if they’ve changed over time, and how factors like age, race and income may contribute.

Our surveys are an important reminder that some technologies, like high-speed internet, remain out of reach for some Americans, particularly those who are less affluent. In fact, our latest survey shows that about four-in-ten Americans living in lower-income households do not subscribe to home broadband.

Why is your team making the switch from phone surveys to the National Public Opinion Reference Survey (NPORS)?

McClain: The internet hasn’t just transformed Americans’ everyday lives – it’s also transformed the way researchers study its impact. The changes we’ve made this year set us up to continue studying tech adoption long into the future.

We began tracking Americans’ tech use back in 2000. At that point, about half of Americans were online, and just 1% had broadband at home. Like much of the survey research world, we relied on telephone polling for these studies, and this approach served us well for decades.

But in more recent years, the share of people who respond to phone polls has plummeted, and these types of polls have become more costly. At the same time, online surveys have become more popular and pollsters’ methods have become more diverse. This transformation in polling is reflected in our online American Trends Panel, which works well for the vast majority of the Center’s U.S. survey work.

But there’s a caveat: Online-only surveys aren’t always the best approach when it comes to measuring certain types of data points. That includes measuring how many people don’t use technology in the first place.

Enter the National Public Opinion Reference Survey, which the Center launched in 2020 to meet these kinds of challenges. By giving people the choice to take our survey on paper or online, it is especially well-suited for hearing from Americans who don’t use the internet, aren’t comfortable with technology or just don’t want to respond online. That makes it a good fit for studying the digital divide. And NPORS achieves a higher response rate than phone polls.

Shifting our tech adoption studies to NPORS ensures we’re keeping up with the latest advances in the Center’s methods toolkit, with quality at the forefront of this important work.

Are the old and new approaches comparable?

McClain: We took several steps to make our NPORS findings as comparable as possible with our earlier phone surveys. We knew that it can be tricky, and sometimes impossible, to directly compare the results of surveys that use different modes – that is, methods of interviewing. How a survey is conducted can affect how people answer questions and who responds in the first place. These are known as “mode effects.”

To try to minimize the impact of this change, we started by doing what we do best: gathering data.

Around the same time that we fielded our phone polls about tech adoption in 2019 and 2021, we also fielded some surveys using alternate approaches. We didn’t want to change the mode right away, but rather understand how any changes in our approach might affect the data we were collecting about how Americans use technology.

These test runs helped narrow our options and tweak the NPORS design. Using the 2019 and 2021 phone data we collected as a comparison point, we worked over the next few years to make the respondent experience as similar as possible across modes.

What does your new approach mean for your ability to talk about changes over time?

McClain: We carefully considered the potential for mode effects as we decided how to talk about the changes we saw in our findings this year. Even with all the work we did to make the approaches as comparable as possible, we wanted to be cautious.

For instance, we paid close attention to the size of any changes we observed. In some cases, the figures were fairly similar between 2021 and 2023, and even without the mode shift, we wouldn’t make too much of them.

We took a closer look at more striking differences. For example, 21% of Americans said they used TikTok in our 2021 phone survey, and that figure has risen to 33% in our paper/online survey. Going back to our test runs from earlier years helped us conclude it’s unlikely this change was all due to mode. We believe it also reflects real change over time.

While the mode shift makes it trickier than usual to talk about trends, we believe the change in approach is a net positive for the quality of our work. NPORS sets us up well for the future.

How are you communicating this mode shift in your published work?

A line chart showing that most U.S. adults have a smartphone, home broadband.

McClain: It’s important to us that readers can quickly and easily understand the shift and when it took place.

In some cases, we’ll be displaying the findings from our paper/online survey side by side with the data points from prior phone surveys. Trend charts in our reports signal the mode shift with a dotted line to draw attention to the change in approach. In our fact sheets, a vertical line conveys the same thing. In both cases, we also provide information in the footnotes below the chart itself.

In other places in our publications, we’re taking an even more cautious approach and focusing on the new data rather than on trends.

Did you have to change the way you asked survey questions?

McClain: Writing questions that keep up with the ever-changing nature of technology is always a challenge, and the mode shift complicated this further. For example, our previous phone surveys were conducted by interviewers, but taking surveys online or on paper doesn’t involve talking to someone. We needed to adapt our questions to keep the experience as consistent as possible on the new paper and online surveys.

Take who subscribes to home broadband, for example. Knowing we wouldn’t have an interviewer to probe and confirm someone’s response in the new modes, we tested out different options in advance to help us ensure we were collecting quality data.

In this case, we gave people a chance to say they were “not sure” or to write in a different type of internet connection, if the ones we offered didn’t quite fit their situation. We also updated the examples of internet connections in the question to be consistent with evolving technology.

Which findings from your latest survey stand out to you?

Anderson: There are several exciting things in our latest work, but two findings related to social media really stand out.

The first is the rise of TikTok. A third of U.S. adults – including about six-in-ten adults under 30 – use this video-based platform. These figures have significantly jumped since we last asked these questions in 2021. And separate surveys from the Center have found that TikTok is increasingly becoming a news source for Americans, especially young adults.

The second is how dominant Facebook remains. While its use has sharply declined among teens in the U.S., most adults – about two-thirds – say they use the site. And this share has remained relatively stable over the past decade or so. YouTube is the only platform we asked about in our current survey that is more widely used than Facebook.

These findings reinforce why consistently tracking the use of technology, especially specific sites and apps, is so important. The online landscape can evolve quickly. As researchers who study these platforms, a forward-looking mindset is key. We’ll continue looking for new and emerging platforms while tracking longer-standing sites to see how use changes – or doesn’t – over time.

To learn more about the National Public Opinion Reference Survey, read our NPORS fact sheet. For more on Americans’ use of technology, read our new reports:

  • Americans’ Use of Mobile Technology and Home Broadband
  • Americans’ Social Media Use

Anna Jackson is an editorial assistant at Pew Research Center


Partition refinement of WorldPop population spatial distribution data method: A case study of Zhuhai, China

Affiliations.

  • 1 Chinese Academy of Surveying and Mapping, Beijing, China.
  • 2 School of Geomatics, Liaoning Technical University, Fuxin, China.
  • PMID: 38578753
  • PMCID: PMC10997122
  • DOI: 10.1371/journal.pone.0301127

Currently, the core idea of refined methods for population spatial distribution is to establish a correlation between population and auxiliary data at the administrative-unit level and then refine it to the grid unit. However, this approach ignores the advantages of public population spatial distribution data. Given these problems, this study proposed a partition strategy using the natural break method at the grid-unit level, which adopts population density to constrain the land-class weight and redistributes the population under the dual constraints of land-class and area weights. Accordingly, we used the dasymetric method to refine the population distribution data. The study established a partition model linking public population spatial distribution data and auxiliary data at the grid-unit level and then refined it to smaller grid units. This method effectively utilizes public population spatial distribution data and addresses the problem that such datasets are not sufficiently accurate to describe small-scale regions at low resolutions. Taking the public WorldPop population spatial distribution dataset as an example, the results indicate that the proposed method has higher accuracy than other public datasets and can describe the actual spatial distribution characteristics of the population accurately and intuitively. At the same time, it provides a new concept for research on population spatial distribution refinement methods.


  • Open access
  • Published: 23 April 2024

Prediction and optimization method for welding quality of components in ship construction

  • Jinfeng Liu 1 ,
  • Yifa Cheng 1 ,
  • Xuwen Jing 1 ,
  • Xiaojun Liu 2 &
  • Yu Chen 1  

Scientific Reports, volume 14, Article number: 9353 (2024)


The welding process, one of the crucial industrial technologies in ship construction, accounts for approximately 70% of the workload and approximately 40% of the total cost. Existing welding quality prediction methods rest on hypothetical premises and subjective factors, so they cannot meet the dynamic control requirements that intelligent welding places on processing quality. To address the low efficiency of quality prediction and the poor timeliness and unpredictability of quality control in the ship assembly-welding process, a data- and model-driven welding quality prediction method is proposed. Firstly, the factors influencing welding quality are analyzed and the correlation mechanism between process parameters and quality is determined. Based on the analysis results, a stable and reliable data collection architecture is established, and the elements of welding process monitoring are determined using a feature dimensionality reduction method. To improve the accuracy of welding quality prediction, the prediction model is constructed by fusing adaptive simulated annealing, particle swarm optimization, and back propagation neural network algorithms. Finally, the effectiveness of the method is verified through 74 sets of plate welding experiments, in which the prediction accuracy reaches over 90%.

Introduction

The shipbuilding industry is a comprehensive national high-end equipment manufacturing industry that supports the shipping industry, marine development, and national defense construction. It plays a critical role in guaranteeing national defense strength and economic development 1,2 . With the continuous development of a new generation of information technology based on big data, the internet of things (IoT), 5G, cloud computing, artificial intelligence, and digital twins, intelligent construction is becoming the dominant advanced mode in the shipbuilding industry. At the same time, welding quality control is regarded as a significant part of shipbuilding, and the related innovation research under intelligent welding urgently needs to be carried out. As welding processing gradually becomes more flexible and complicated, the welding quality of each workstation ultimately determines the majority of the product quality through propagation, accumulation, and interaction.

The welding process is one of the vital industrial technologies in ship segment construction 3,4 . However, when welding ship components, local uneven heating and local uncoordinated plastic strain of the metal materials are likely to produce large residual stresses 5,6 . These reduce the static load capacity and fatigue strength of the components, which in turn affects the load capacity, dimensional accuracy, and assembly accuracy of the structure. Moreover, in most shipbuilding enterprises, quality management usually involves issuing quality plans, post-sequence production inspections, and quality statistical reports, all of which are static quality control. The existing welding quality prediction methods have hypothetical premises and subjective factors, which cannot meet the dynamic control requirements of intelligent welding for processing quality. These methods often suffer from inefficient quality inspection, untimely quality feedback, and untimely quality control 7 . The post-welding correction process also delays the ship construction cycle and increases production cost.

The inadequacy of traditional welding quality control technology imposes functional and technical limitations in practical applications 8,9 . Firstly, current welding process design relies on production experience and empirical calculation formulas 10 , which makes it difficult to meet the design requirement of minimizing residual stress when forming structural parts. Secondly, effective data pre-processing methods for handling complex production conditions and massive amounts of welding measurement data are absent. Finally, the existing welding quality prediction methods for ship components are inadequate: for example, it is difficult to balance prediction accuracy against computational efficiency, or to combine actual measured welding data to drive data analysis services.

This work aims to provide a solution to the inefficiency of welding quality control during ship construction, delaying the production cycle and increasing production costs. The proposed method has the following advantages.

The data-acquisition framework for welding process parameters of ship unit-assembly welding is constructed, a stable and reliable data acquisition method is proposed.

Based on the feature selection method, the influence feature of the welding quality is quantitatively analyzed. This leads to the construction of an optimal subset of process influencing features for welding quality prediction.

Fusing an adaptive simulated annealing (SA), the particle swarm optimization (PSO) and the back propagation neural network (BPNN), a welding quality prediction model is established for welding quality control and decision making.

The remainder of this paper is organized as follows. “ Related works ” section presents the related research on the influence factor and prediction methods of welding quality. The data acquisition and processing framework is explained in “ Acquisition and pre-processing of welding process data ” section. In “ Construction the welding quality prediction model ” section, fusing an Adaptive SA, the PSO and the BPNN (APB), a welding quality prediction model is established. To verify the proposed method, the case study of ship unit-assembly welding is illustrated in “ Case study ” section. The conclusion and future work are shown in “ Conclusion and future works ” section.

Related works

Method for selecting welding quality features

From the huge amount of data generated at the production site, knowledge can be mined through suitable processing methods to assist production 11,12 . Feature selection is an important and widely used technique in this field. Its purpose is to select a small subset of features from the original dataset based on certain evaluation criteria, which usually yields better performance, such as higher classification accuracy, lower computational cost, and better model interpretability. As a practical method, feature selection has been widely used in many fields 13,14,15 .

Depending on how the evaluation is performed, feature selection methods can be distinguished as filter models, wrapper models, or hybrid models. Filter models evaluate and select a subset of features based on the general characteristics of the data without involving any learning model. On the other hand, the wrapper model uses a learning algorithm set in advance and uses its performance as an evaluation criterion. Compared to filter models, wrapper models are more accurate but computationally more expensive. Hybrid models use different evaluation criteria at different stages and combine the advantages of the first two methods.

Two versions of an ant colony optimization-based feature selection algorithm were proposed by Warren et al. 16 , which can effectively improve weld defect detection accuracy and weld defect type classification accuracy. An enhanced feature selection method combining the Relief-F algorithm with a convolutional neural network (CNN) was proposed by Jiang et al. 17 to improve the recognition accuracy of welding defect identification in the manufacturing of large equipment. A hybrid Fisher-based filter and wrapper-based feature selection algorithm was proposed by Zhang et al. 18 , which reduces the 41 feature parameters for weld defect monitoring during tungsten arc welding of aluminum alloy to 19; the computational effort is reduced and the modeling accuracy is improved. Abdel et al. 19 combined two-phase mutation with the gray wolf optimization algorithm to solve the wrapper-based feature selection problem, balancing accuracy and efficiency in the classification task while maintaining and improving classification accuracy. Le et al. 20 introduced a stochastic privacy-preserving machine learning algorithm in which the Relief-F algorithm is used for feature selection and a random forest for privacy-preserving classification; the algorithm avoids overfitting and obtains higher classification accuracy.

In general, a huge amount of measured welding data is generated during the welding of actual ship components because of the complex production conditions. Feature selection can therefore suffer from high computational cost, a tendency to fall into local optima, and premature convergence. To determine the essential factors influencing the welding process, it is necessary to use a suitable feature selection method that yields a reasonably parsimonious, high-quality set of input features. This maximizes accuracy and computational efficiency while reducing the computational complexity of the prediction model.

Welding quality prediction method

As new-generation information technology becomes popular in ship construction, process data can be collected from the manufacturing site. These data contain a non-linear mapping between welding process parameters and quality, so welding data monitoring, welding quality prediction, and optimization decisions can be effectively implemented. Welding quality prediction based on machine learning algorithms has therefore received wide attention from academia and industry.

Pal et al. predicted welding quality by processing the current and voltage signals of the welding process 21,22 ; taking the process parameters and statistical parameters of the arc signal as input variables, BPNN and radial basis function network models were adopted to predict welding quality. A fatigue strength prediction method for ultra-high-strength steel butt-welded joints was proposed by Nykanen 23 . A reinforcement-penetration collaborative prediction network model based on deep residual learning was designed by Lu et al. 24 to predict reinforcement and penetration depth quantitatively. A nugget quality prediction method for resistance spot welding of aluminum alloy based on structure-borne acoustic emission signals was proposed by Luo et al. 25 .

Along with the maturity of related theories such as machine learning and neural networks, the task of welding quality prediction is increasingly implemented by scholars using related techniques.

Artificial neural networks (ANN): In automatic gas metal arc welding processes, response surface methodology and ANN models were adopted by Shim et al. 26 to predict the best welding parameters for a given weld bead geometry. Lei et al. 27 used a genetic algorithm to optimize the initialization weights and biases of a neural network, proposing a multi-information fusion neural network to predict the geometric characteristics of the weld by combining the welding parameters and the morphological characteristics of the molten pool. Chaki et al. 28 proposed an integrated ANN and non-dominated sorting genetic algorithm model to predict and optimize the quality characteristics during pulsed Nd:YAG laser cutting of aluminum alloys. An improved regression network was adopted by Wang et al. 29 to predict the future molten pool image. CNNs were used by Hartl et al. 30 to analyze process data in friction stir welding and predict the resulting quality of the weld surface. To predict the penetration of fillet welds, a penetration quality prediction method for asymmetrical fillet root welding based on an optimized BPNN was proposed by Chang et al. 31 . A CNN-based back bead prediction model was proposed by Jin et al. 32 : image data of the changing welding current are acquired, and the CNN model is used to predict the weld shape in gas metal arc welding. Hu et al. 33 established an ANN optimized by a pigeon-inspired algorithm to optimize the welding process parameters of ultrasonic-static shoulder-assisted friction stir welding (U-SSFSW), which led to a significant improvement in the tensile strength of the joints. Cruz et al. 34 presented a procedure for yielding a near-optimal ensemble of CNNs through an efficient search strategy based on an evolutionary algorithm, able to weigh the predictive accuracy of forecasting models against computational costs under actual production conditions on the shop floor.

Support vector machines (SVM): SVMs using the radial kernel, boosting, and random forest techniques were adopted by Pereda et al. 35 to achieve direct quality prediction in the resistance spot welding process. To improve the ability to predict welding quality during high-power disk laser welding, an SVM model was adopted by Wang et al. 36 to predict welding quality from the metal vapor plume. By collecting the real-time torque signal of the friction stir welding process, Das et al. 37 used an SVM regression model to predict the ultimate tensile strength of the welded joint. A model of laser welding quality prediction based on different input parameters was established by Petkovic 38 . Yu et al. 39 proposed a real-time prediction method for welding penetration mode and depth based on two-dimensional visual characteristics of the weld pool.

Other prediction models: A neuro-fuzzy model for the prediction and classification of defects in the fused zone was built by Casalino et al. 40 : using laser beam welding process parameters as input variables, neural networks and C-means fuzzy clustering algorithms are used to classify and predict the welding defects of Ti6Al4V alloy. Rout et al. 41 proposed a hybrid method based on fuzzy regression and particle swarm optimization to predict and optimize weld quality in terms of both mechanical properties and weld geometry. Kim et al. 42 proposed a semantic resistance spot welding weldability prediction framework that constructs a shareable weldability knowledge database based on regression rules; a decision tree algorithm and regression tree are used to extract decision rules, and the nugget width of the case was successfully predicted. AbuShanab et al. 43 proposed a stochastic vector functional link prediction model optimized by the Hunger Games search algorithm to link joint properties with welding variables, introducing a new prediction model for friction stir welding of dissimilar polymer materials.

Scholars have proposed various feasible schemes for welding quality prediction. However, defects remain: the prediction algorithms often lack generalization performance and carry a large number of assumptions and subjective factors, so they cannot meet the dynamic control requirements of intelligent welding for processing quality. Moreover, most prediction models can only predict before or after the work and cannot track dynamic changes in the on-site welding environment. The crux of improving welding quality is therefore to deliver accurate and timely prediction results.

Acquisition and pre-processing of welding process data

The welding quality prediction framework is proposed and shown in Fig. 1 (a clearer version is shown in Supplementary Figure S1 ). Firstly, the critical controllable quality indicators in ship unit-assembly welding are determined, and the influencing factors are analyzed. Secondly, based on the IoT system, a data acquisition system for real-time monitoring and prediction of the welding quality of ship unit-assembly welding is established, achieving collection and transmission of welding data. Then, a feature selection method is created to optimally select the key features of the welding quality data. Next, fusing adaptive simulated annealing, particle swarm optimization, and the back propagation neural network, a welding quality prediction model is established for welding quality control and decision making. Finally, welding experiments on ship unit-assembly welding are used as an example to verify the critical technologies in the paper.

Figure 1. The framework of the welding quality prediction method.

Analyze the factors affecting welding quality

The causes of welding quality problems in ship components involve six significant factors: human factors, welding equipment, materials, welding process, measurement system, and production environment. Residual stresses caused by instability during the welding of ship components are inextricably linked to the welding method and process parameters used. However, the essential factors are mainly determined by the thermal welding process and the constraint conditions of the weldment during welding. The influencing factors of welding quality in the thermal welding process are mainly reflected in the welding heat source type and its power density \(W\) , the effective thermal efficiency \(P\) and linear energy \(Q\) of the welding process, the heat energy’s transfer method (such as heat conduction, convection, etc.), and the welding temperature field. The determinants of the welding temperature field include the nature of the heat source and the welding parameters (such as welding current, arc voltage, gas flow, inductance, welding speed, heat input, etc.). When the arc voltage and welding current increase, the heat energy input to the weld and the melting amount of the welding wire increase directly, which affects the width and penetration of the weld. When the welding speed is too low, heat concentrates and the width of the molten pool increases, resulting in defects such as burn-through and dimensional deformation. The constraint conditions refer to the restraint type and restraint degree of the welded structure; their value is mainly determined by factors such as the structure of the welded sheet, the position of the weld, the welding direction and sequence, the shrinkage of other parts during the cooling process, and the tightness of the clamped part.

Take the CO2 gas-shielded welding process of a ship component as an example. Welding parameters determine the energy input to the weld and to a large extent affect the formation of the weld. Important process parameters that determine the welding quality of thin-plate ship structures include arc voltage, welding current, welding speed, inductance, and gas flow. For example, when the welding current is too large, the weld width, penetration, and reinforcement that determine the dimensions of the weld all increase, and welding defects are likely to occur during the welding process; at the same time, the angular deformation and bending deflection of the welded sheet also increase. When the gas flow rate is too high, the melt pool and arc become unstable and disturbed, resulting in turbulence and spatter in the melt pool.

Obtain the welding process data

The collection and transmission of process parameters during the welding process is an important basis for supporting real-time prediction of welding quality. Therefore, a welding data acquisition and processing framework for ship components based on the IoT is proposed, which is mainly divided into three levels: data perception, data transmission and preprocessing, and application services, as shown in Fig. 2 .

Figure 2. A welding process data acquisition and processing framework.

During the execution of the welding process, the data sensing layer is mainly responsible for collecting various multi-source heterogeneous data in real time and accurately, providing a stable source of raw data for the data integration and analysis phase. The sensed data types mainly include welding process parameters, operating status information of the welding equipment, and welding quality indicators. Data can be collected through interface and protocol conversion or by connecting external intelligent sensing devices. For example, for some non-digital smart devices, data can be collected from the analog signals of electrical circuits: current, voltage, and welding speed are gathered from the welding equipment by installing the corresponding sensors, and a data acquisition board, such as a PCL-818L, is used for analog-to-digital conversion, aggregation, and data transmission. For most digital intelligent devices, data collection can use various communication interfaces or serial ports, PLC networks, or other methods to collect and summarize the execution parameters and operating status of the equipment. Then, through a corresponding communication protocol, such as OPC-UA or MQTT, read and write operations among the application, the server, and the PLC are realized.
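
As a concrete illustration of this sensing-to-transmission path, the sketch below publishes welding samples over MQTT using the open-source paho-mqtt client; the broker address, topic name, payload fields, and fixed sensor values are hypothetical stand-ins for readings from the acquisition board or PLC.

```python
import json
import time

from paho.mqtt import publish

BROKER = "192.168.1.10"  # hypothetical broker on the workshop network

for _ in range(10):  # bounded loop for illustration; real acquisition runs continuously
    sample = {
        # Fixed numbers stand in for values read from sensors or the PLC.
        "timestamp": time.time(),
        "current_A": 220.0,
        "voltage_V": 24.5,
        "speed_mm_s": 6.0,
        "gas_flow_L_min": 18.0,
    }
    publish.single("workshop/station1/welding", json.dumps(sample), hostname=BROKER)
    time.sleep(1)  # roughly 1 Hz sampling
```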

The data transmission layer is mainly responsible for transmitting the multi-source heterogeneous welding data collected on-site, achieving interconnectivity between the underlying devices, application services, and multiple databases. As the new generation of communication technology matures, there are many options for industrial-grade information communication, such as 5G, Zigbee, industrial WiFi, and industrial Ethernet. According to actual needs and complementary advantages, a combined communication scheme can be formed to meet the requirements for transmission and anti-interference ability, communication speed, and stability. Taking into account the application scenario and the system's functional requirements, this study combines wired and wireless communication technologies to achieve efficient deployment of the communication network, with fast, stable transmission of real-time welding data and portable networking.

The diversity of equipment in shipbuilding workshops and the heterogeneity of application systems have caused data to have multi-source and heterogeneous characteristics. Therefore, data integration is to shield the differences in data types and structures to realize unified storage, management, and data analysis. The key technologies of data integration include data storage and management, as well as data preprocessing. Among them, data storage management is the basis for maximizing data value and data preprocessing. Standard database technologies include SQL databases, such as MySQL, Oracle, etc., Redis, HBase, and other types of NoSQL databases. The specific deployment can be mixed and used according to actual needs and application scenarios to achieve complementary advantages and maximize benefits.

Data feature selection

Data feature selection is the premise for ensuring the quality of data analysis and mining. It not only ensures the quality and uniform format of the perceived data set, but also effectively avoids feature jumble and the curse of dimensionality during data analysis. The welding data collected on-site will inevitably exhibit missing values, non-standard formats, and large volume, requiring data filtering, data recovery, and data conversion to improve data quality and unify data formats.

The Relief-F algorithm was obtained by I. Kononenko 44 by extending the Relief algorithm. It is a feature weighting algorithm: it assigns different weights to features based on the correlation between each feature and the class, and features whose weights fall below a certain threshold are removed, thereby optimizing the feature set. For multi-classification problems, suppose that the single-label training data set \(D=\{\left({x}_{1},{y}_{1}\right),\left({x}_{2},{y}_{2}\right),\dots ,({x}_{n},{y}_{n})\}\) can be divided into \(|c|\) categories. For an example \({X}_{i}\) belonging to class \({K}_{j}\) \(({K}_{j}\in \{\mathrm{1,2},\dots ,|c|\})\) , Relief-F finds the nearest neighbor examples in the sample set of class \({K}_{j}\) and of each other class. Suppose that the near-hit examples of \({X}_{i}\) are \({X}_{i,l,nh}\) ( \(l=\mathrm{1,2},\dots ,\left|c\right|;l\ne {K}_{j}\) ) and the near-miss examples of \({X}_{i}\) are \({X}_{i,l,nm}\) . An iterative calculation formula is then used to update the feature weight \(W(A)\) of each attribute feature A. Given the input data set \(D\) , set the number of sampling iterations to \(m\) , the feature-weight threshold to \(\delta\) , and the number of nearest neighbor samples to \(k\) ; the corresponding calculation proceeds as follows:

The feature weight \(W(A)\) of each attribute is initialized to 0, and the feature weight set \(T\) of the sample data set \(D\) is initialized to the empty set.

Start the iterative calculation by randomly selecting an example \({X}_{i}\) from the sample data set \(D\).

From the samples in \(D\) of the same class as \({X}_{i}\), find the \(k\) near-hit examples \({X}_{i,l,nh}\), denoted \({H}_{i}(c)\) \((i=1,2,\dots ,k;c=class({X}_{i}))\). From the samples in \(D\) of each different class, find the \(k\) near-miss examples \({X}_{i,l,nm}\), denoted \({M}_{i}(\widehat{c})\) \((\widehat{c}\ne class({X}_{i}))\).

Update the feature weight \(W(A)\) and the feature weight set \(T\); the calculation formulas are as follows:
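
In the standard multi-class form of Relief-F, whose notation the definitions below follow, the weight of attribute A is decreased by its average separation from the \(k\) near-hits and increased by the prior-weighted average separation from the \(k\) near-misses of every other class, averaged over the \(m\) samplings:

\[
W(A)\leftarrow W(A)-\sum_{j=1}^{k}\frac{diff(A,{X}_{i},{H}_{j})}{mk}+\sum_{\widehat{c}\ne class({X}_{i})}\left[\frac{P(\widehat{c})}{1-P(class({X}_{i}))}\sum_{j=1}^{k}\frac{diff(A,{X}_{i},{M}_{j}(\widehat{c}))}{mk}\right]
\]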

where \(diff\left(A,{X}_{1},{X}_{2}\right)\) represents the distance between samples \({X}_{1}\) and \({X}_{2}\) on feature \(A\), \(class\left({X}_{i}\right)\) represents the class label of sample \({X}_{i}\), and \(P\left(c\right)\) represents the prior probability of class label c.

According to the weight calculation results of each attribute, the feature set of the initial input data is filtered. Specifically, a threshold \(\tau\) must be specified; by Chebyshev's inequality its value should satisfy \(0<\tau \le 1/\sqrt{\alpha m}\), where \(\alpha\) is the probability of accepting an irrelevant feature and \(m\) is the number of welding data samples.
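
As a concrete illustration, the sketch below computes Relief-F weights and filters features against a threshold. It is a minimal Python rendering of the steps above, not the authors' implementation; the function and variable names are our own, and \(diff\) is taken as the per-feature absolute difference on features pre-scaled to comparable ranges.

```python
import numpy as np

def relieff_weights(X, y, m_samples=80, k=5, seed=None):
    """Relief-F sketch: per-feature weights from the average separation
    between k near-hits and (prior-weighted) k near-misses."""
    rng = np.random.default_rng(seed)
    n, n_feat = X.shape
    classes, counts = np.unique(y, return_counts=True)
    priors = dict(zip(classes, counts / n))
    W = np.zeros(n_feat)
    for _ in range(m_samples):
        i = rng.integers(n)
        xi, ci = X[i], y[i]
        # k near-hits: nearest same-class neighbours (excluding x_i itself)
        same = np.where(y == ci)[0]
        same = same[same != i]
        hits = same[np.argsort(np.abs(X[same] - xi).sum(axis=1))[:k]]
        W -= np.abs(X[hits] - xi).mean(axis=0) / m_samples
        # k near-misses per other class, weighted by that class's prior
        for c in classes:
            if c == ci:
                continue
            other = np.where(y == c)[0]
            miss = other[np.argsort(np.abs(X[other] - xi).sum(axis=1))[:k]]
            w_c = priors[c] / (1.0 - priors[ci])
            W += w_c * np.abs(X[miss] - xi).mean(axis=0) / m_samples
    return W

# Hypothetical usage: X_w holds the welding samples, y_q the quality labels.
# W = relieff_weights(X_w, y_q, m_samples=80, k=5)
# selected = np.where(W > 0.14)[0]   # keep features above the threshold tau
```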

Constructing the welding quality prediction model

The welding quality prediction model based on APB

The BPNN is the most successful learning algorithm for training multi-layer feedforward neural networks; its iterative computation can be expressed mathematically as follows 45. Assume the sample dataset \(D=\left\{\left({x}_{1}{,y}_{1}\right),\left({x}_{2}{,y}_{2}\right),\dots ,\left({x}_{n}{,y}_{n}\right)\right\},{x}_{i}\in {R}^{m},{y}_{i}\in {R}^{z}\), where each input sample vector contains m feature attributes and each output is a z-dimensional real-valued vector. A classical error back-propagation network structure is formed by m input nodes, q hidden-layer nodes, and z output nodes; the three-layer multilayer feedforward structure is taken as the example. Let the threshold of the h-th hidden-layer node be \({\gamma }_{h}\) and the threshold of the j-th output-layer node be \({\theta }_{j}\). The connection weight between the i-th input node and the h-th hidden node is denoted \({v}_{ih}\), and the connection weight between the h-th hidden node and the j-th output node is denoted \({\omega }_{hj}\). Let k be the number of training iterations of the network model.

The input vector \({O}_{h}\) of each hidden-layer node is computed from the threshold \({\gamma }_{h}\) and the connection weights \({v}_{ih}\) between the input layer and the hidden layer. The output vector \({S}_{h}\) of each hidden-layer node is then generated by passing \({O}_{h}\) through the activation function \(f(x)\):
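
A standard formulation consistent with these definitions (the sign convention for the threshold term may differ from the original) is:

\[
{O}_{h}=\sum_{i=1}^{m}{v}_{ih}{x}_{i}-{\gamma }_{h},\qquad {S}_{h}=f({O}_{h})
\]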

Then, using the hidden-layer output vectors, the connection weights \({\omega }_{hj}\), and the threshold \({\theta }_{j}\), the input vector \({\beta }_{j}\) of each output-layer node is calculated; passing \({\beta }_{j}\) through the activation function \(f(x)\) yields the output response vector \({T}_{j}\) of each output-layer node:
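
In the same standard formulation, the output layer computes:

\[
{\beta }_{j}=\sum_{h=1}^{q}{\omega }_{hj}{S}_{h}-{\theta }_{j},\qquad {T}_{j}=f({\beta }_{j})
\]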

For a training sample \(\left({x}_{i}{,y}_{i}\right)\), let the output vector of the error back-propagation neural network be \({T}_{i}\). The mean square error \({E}_{i}\) between the actual output value \({T}_{i}\) and the expected output value \({y}_{i}\) of the input training sample \(\left({x}_{i}{,y}_{i}\right)\) can then be calculated as:
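
With the conventional factor of 1/2 that simplifies the subsequent gradient (the original may omit it), the error is:

\[
{E}_{i}=\frac{1}{2}\sum_{j=1}^{z}{\left({T}_{j}-{y}_{j}^{i}\right)}^{2}
\]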

The BP neural network is an iterative learning algorithm; in each round of iteration, any parameter \(\delta\) is updated based on the gradient descent strategy:
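
That is, each parameter moves against the gradient of the error, scaled by the learning rate \(\eta\):

\[
\Delta \delta =-\eta \frac{\partial {E}_{i}}{\partial \delta }
\]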

Given the learning rate \(\eta\), the formula is derived for the increment \(\Delta {V}_{ih}\) of the connection weight between the input and hidden layers. Note that \({V}_{ih}\) first affects the input and output vectors of the h-th hidden-layer node, then the input and output vectors of the j-th output-layer node, and finally \({E}_{i}\). That is:
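
Written out, the chain rule gives:

\[
\frac{\partial {E}_{i}}{\partial {v}_{ih}}=\frac{\partial {E}_{i}}{\partial {S}_{h}}\cdot \frac{\partial {S}_{h}}{\partial {O}_{h}}\cdot \frac{\partial {O}_{h}}{\partial {v}_{ih}}
\]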

It is assumed that a typical sigmoid function is used for the nodes of both the hidden and output layers. The sigmoid has the characteristic property that its derivative can be expressed through the function itself. That is:
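
For the sigmoid:

\[
f(x)=\frac{1}{1+{e}^{-x}},\qquad {f}^{\prime}(x)=f(x)\left(1-f(x)\right)
\]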

Substituting into Eq. (9), the update equation for \(\Delta {v}_{ih}\) can be solved. Similarly, the update formulas for \(\Delta {\omega }_{hj}\), \(\Delta {\theta }_{j}\), and \(\Delta {\gamma }_{h}\) can be obtained. That is:
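
In the standard derivation, writing \({g}_{j}\) and \({e}_{h}\) for the output-layer and hidden-layer gradient terms (symbols introduced here for compactness, not taken from the original), the four updates take the form:

\[
{g}_{j}={T}_{j}(1-{T}_{j})({y}_{j}-{T}_{j}),\qquad {e}_{h}={S}_{h}(1-{S}_{h})\sum_{j=1}^{z}{\omega }_{hj}{g}_{j}
\]

\[
\Delta {\omega }_{hj}=\eta {g}_{j}{S}_{h},\quad \Delta {\theta }_{j}=-\eta {g}_{j},\quad \Delta {v}_{ih}=\eta {e}_{h}{x}_{i},\quad \Delta {\gamma }_{h}=-\eta {e}_{h}
\]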

The BPNN model can realize arbitrarily complex multidimensional nonlinear mappings, but it easily falls into local optima. Particle swarm optimization is a global random search algorithm based on swarm intelligence; it has good global search performance and generality for problems with multiple objective functions and constraints, and it can improve the convergence accuracy and prediction performance of the BPNN. Therefore, by fusing adaptive simulated annealing, particle swarm optimization, and the back-propagation neural network, a welding quality prediction algorithm (APB) is created. The algorithm flow is shown in Fig. 3.

Figure 3. The APB algorithm flow.

During the iterative optimization of the algorithm, each particle updates its position by tracking its own individual extremum and the global extremum of the population. The movement of a particle is composed of three parts, reflecting its tendency to maintain its previous velocity, to approach the best position in its history, and to cooperate and share information with the group. The update formulas for particle velocity and position are as follows:
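
In standard PSO notation, matching the symbols defined below, the updates are:

\[
{v}_{i}(k+1)=w{v}_{i}(k)+{c}_{1}{r}_{1}\left({P}_{best.i}(k)-{x}_{i}(k)\right)+{c}_{2}{r}_{2}\left({G}_{best}(k)-{x}_{i}(k)\right)
\]

\[
{x}_{i}(k+1)={x}_{i}(k)+{v}_{i}(k+1)
\]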

where the critical parameters of each part are as follows: \(w\) is the inertia weight coefficient; \({c}_{1}\) and \({c}_{2}\) are the self-cognitive and social cognitive factors, respectively; \({v}_{i}(k)\) and \({x}_{i}\left(k\right)\) represent the velocity and position of particle \(i\) at the k-th iteration; \({r}_{1}\) and \({r}_{2}\) are uniform random numbers in the range \([0,1]\); and \({P}_{best.i}\left(k\right)\) and \({G}_{best}(k)\) represent the individual optimal solution of particle \(i\) and the global optimal solution at the k-th iteration.

\(w\), \({c}_{1}\), and \({c}_{2}\) are essential parameters controlling the iteration of the particle swarm optimization (PSO) algorithm. \(w\) governs the inertia of particle flight and the strength of the algorithm's search ability, while \({c}_{1}\) and \({c}_{2}\) directly bias a particle's motion toward the individual or the group optimum. Therefore, to make PSO adaptive, this study dynamically adjusts \(w\), \({c}_{1}\), and \({c}_{2}\) during the iterative calculation to control the local and global search strategies and the collaborative sharing ability of the algorithm. A nonlinear control strategy based on a negative hyperbolic tangent curve governs the change of \(w\), and the values of \({c}_{1}\) and \({c}_{2}\) vary with the iteration count \(k\) of PSO. The update formulas of the related parameters are as follows:
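One plausible realization of these schedules, decreasing \(w\) along a negative hyperbolic tangent curve while \({c}_{1}\) falls and \({c}_{2}\) rises linearly, is sketched here; the exact curve constants used in the original are assumptions on our part:

\[
w(k)=\frac{{w}_{max}+{w}_{min}}{2}-\frac{{w}_{max}-{w}_{min}}{2}\,\mathrm{tanh}\left(4\left(\frac{2k}{{k}_{max}}-1\right)\right)
\]

\[
{c}_{1}(k)={c}_{1max}-\left({c}_{1max}-{c}_{1min}\right)\frac{k}{{k}_{max}},\qquad {c}_{2}(k)={c}_{2min}+\left({c}_{2max}-{c}_{2min}\right)\frac{k}{{k}_{max}}
\]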

where \({w}_{max}\) and \({w}_{min}\) are the maximum and minimum values of the inertia weight coefficient. \(k\) is the current number of iterations. \({k}_{max}\) is the maximum number of iterations. \({c}_{1max}\) , \({c}_{1min}\) are the maximum and minimum values of the self-cognitive factor. \({c}_{2max}\) , \({c}_{2min}\) are the maximum and minimum values of the social cognitive factor.

In addition, to improve the search dispersion of the PSO algorithm and avoid convergence to local minima, SA is applied within the cyclic solution process of PSO. The SA algorithm is an adaptive, iterative, heuristic probabilistic search method with strong robustness, global convergence, computational parallelism, and adaptability; it is suitable for solving nonlinear problems as well as optimization problems over different types of design variables. The specific process of the algorithm is as follows:

Select the welding quality influencing factors with strong correlation as the input feature set and the corresponding welding quality data as the output attribute set, establishing the training and verification data sets of the algorithm;

Preliminarily construct the BPNN prediction model for welding quality prediction;

Set the fitness function as the mean-square-error function used to evaluate predictive performance; the flying particles are the weight and threshold parameter matrices of the neural network nodes. Initialize the particle population size N and the maximum number of evolutions M, set the search space dimension and velocity range of the particle swarm, and randomly initialize the positions and velocities of all particles in the population;

Calculate the fitness values of all initial particles in the population, determine the individual optimal positions \({{\text{P}}}_{best}\) and the population optimal position \({{\text{G}}}_{best}\) by comparison, and set the initial temperature \(T\left(0\right)\) of the simulated annealing algorithm according to formula (22);

Update the positions and velocities of the particles, adaptively adjusting w, \({c}_{1}\), and \({c}_{2}\) according to formulas (18), (19), and (20); perform one round of iterative optimization and update the global optimum of the population;

Set \(T=T(0)\) and take the global optimal solution as the initial solution \({S}_{1}\); determine the number of iterations at each temperature T, denoted the chain length L of the Metropolis algorithm;

Add a stochastic perturbation to the solution \({S}_{1}\) of the current round of iteration, generating a new solution \({S}_{2}\);

Calculate the increment \(df=f({S}_{2})-f({S}_{1})\) of the new solution \({S}_{2}\), where \(f(x)\) is the fitness function;

If \(df<0\), \({S}_{2}\) is accepted as the current solution of this round, i.e. \({{S}_{1}=S}_{2}\). If \(df>0\), the acceptance probability \({\text{exp}}(-df/T)\) of \({S}_{2}\) is calculated and a uniformly distributed random number rand is generated in the interval (0,1); when \({\text{exp}}(-df/T)>rand\), \({S}_{2}\) is accepted as the new solution of this round, otherwise the current solution \({S}_{1}\) is retained;

If the prediction error of the current solution \({S}_{1}\) reaches the accuracy requirement, or the number of iterations reaches the maximum M, the algorithm terminates and the current global optimal solution is output; otherwise, the current temperature T is decayed according to the set attenuation function and the algorithm returns to step 5 for another cycle until the condition is met;

Output the current optimal particle, i.e. the optimal threshold and weight vectors, fit the validation sample set and calculate the prediction error, and return to step 5 if the requirements are not met. A minimal code sketch of this procedure follows.
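
The loop below is a minimal Python sketch of this SA-tempered PSO procedure, assuming a `fitness` function that unpacks a flat parameter vector into the BPNN's weights and thresholds and returns the training MSE. The population size, perturbation scale, and adaptive schedules are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def train_apb(fitness, dim, n_particles=30, k_max=200,
              w_bounds=(0.4, 0.9), c_bounds=(1.25, 2.5),
              T0=1e4, mu=0.9, chain_len=10, seed=None):
    """SA-tempered PSO sketch: PSO drives the swarm, and a short
    Metropolis chain perturbs the incumbent to escape local minima."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1, 1, (n_particles, dim))
    v = rng.uniform(-0.1, 0.1, (n_particles, dim))
    p_best = x.copy()
    p_val = np.array([fitness(p) for p in x])
    g_best, g_val = p_best[p_val.argmin()].copy(), p_val.min()
    T = T0
    for k in range(k_max):
        # adaptive coefficients (assumed schedules; see the text above)
        w_lo, w_hi = w_bounds
        w = (w_hi + w_lo) / 2 - (w_hi - w_lo) / 2 * np.tanh(4 * (2 * k / k_max - 1))
        c1 = c_bounds[1] - (c_bounds[1] - c_bounds[0]) * k / k_max
        c2 = c_bounds[0] + (c_bounds[1] - c_bounds[0]) * k / k_max
        r1, r2 = rng.random((2, n_particles, 1))
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        x = x + v
        vals = np.array([fitness(p) for p in x])
        better = vals < p_val
        p_best[better], p_val[better] = x[better], vals[better]
        if p_val.min() < g_val:
            g_best, g_val = p_best[p_val.argmin()].copy(), p_val.min()
        # Metropolis chain: accept worse moves with prob. exp(-df / T)
        s1, f1 = g_best.copy(), g_val
        for _ in range(chain_len):
            s2 = s1 + rng.normal(0.0, 0.05, dim)
            df = fitness(s2) - f1
            if df < 0 or np.exp(-df / T) > rng.random():
                s1, f1 = s2, f1 + df
        if f1 < g_val:
            g_best, g_val = s1.copy(), f1
        T *= mu  # geometric cooling: T(k+1) = mu * T(k)
    return g_best, g_val
```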

After each iteration, the algorithm decays the temperature from its initial value \(T\left(0\right)\). The algorithm then accepts not only improving solutions but also, with a certain probability \({P}_{T}\), inferior ones, which improves the ability of PSO to jump out of local optima during the iterative optimization process. The update formulas of the related parameters are as follows:
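
In the standard Metropolis form, matching the symbols defined below, the acceptance probability and the geometric cooling rule are:

\[
{P}_{T(k)}(i)=\left\{\begin{array}{ll}1,& f\left({X}_{i+1}^{T(k)}\right)\le f\left({X}_{i}^{T(k)}\right)\\ \mathrm{exp}\left(-\frac{f\left({X}_{i+1}^{T(k)}\right)-f\left({X}_{i}^{T(k)}\right)}{T(k)}\right),& \text{otherwise}\end{array}\right.\qquad T(k+1)=\mu T(k)
\]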

where \({X}_{i+1}^{T(k)}\) represents the individual solution at the current temperature \(T\left(k\right)\), \({P}_{T\left(k\right)}(i)\) is the acceptance probability with which the new solution \({X}_{i+1}^{T(k)}\) replaces the historical solution \({X}_{i}^{T(k)}\), \(T\left(k\right)\) represents the current temperature of the k-th annealing, and \(\mu\) represents the cooling coefficient.

To evaluate the prediction accuracy of the improved algorithm model, the coefficient of determination (R2), the mean absolute percentage error (MAPE), and the root mean square error (RMSE) are selected as error metrics in this study. The reference formulas are:
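
In their standard definitions, matching the notation explained below:

\[
{R}^{2}=1-\frac{\sum_{i=1}^{n}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}}{\sum_{i=1}^{n}{\left({y}_{i}-\overline{y}\right)}^{2}},\qquad MAPE=\frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{{y}_{i}-{\widehat{y}}_{i}}{{y}_{i}}\right|,\qquad RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left(h\left({X}^{(i)}\right)-{y}^{(i)}\right)}^{2}}
\]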

where n is the sample size of the dataset; \({y}_{i}\) is the actual observation of the i-th sample instance; \({\widehat{y}}_{i}\) is the fitted prediction of the i-th sample instance; \(\overline{y }\) is the mean observation over the n sample instances; \({y}^{(i)}\) is the actual value of the i-th instance; and \(h({X}^{(i)})\) is the predicted value of the i-th instance.

The R2 indicates how well the regression model captures the covariation between the sample independent variables and the dependent variable; the MAPE and RMSE reflect the degree of deviation between the predicted and actual values. The three metrics are computed as shown below.
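
The three metrics translate directly into NumPy; the function names here are our own:

```python
import numpy as np

def r2(y, yhat):
    """Coefficient of determination."""
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

def mape(y, yhat):
    """Mean absolute percentage error, in percent."""
    return 100 * np.mean(np.abs((y - yhat) / y))

def rmse(y, yhat):
    """Root mean square error."""
    return np.sqrt(np.mean((y - yhat) ** 2))
```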

Process parameter optimization

The genetic algorithm was first proposed by John Holland 46 based on the evolutionary laws of biological populations in nature. It obtains the optimal solution by simulating the natural evolution of a biological population, can handle complex nonlinear combinatorial optimization problems, and has good global optimization ability, so it is widely used in optimization problems across many engineering fields. Based on the welding quality prediction model built above, the genetic algorithm is introduced to optimize the welding process parameters and obtain the optimal combination of process parameters.

The specific idea is to encode each process parameter (welding current, arc voltage, welding speed, wire elongation, welding gas flow, and inductance) as a real-valued gene composing a chromosome, so that one chromosome represents one combination of welding process parameters. An initial population is generated; selection, crossover, and mutation then produce new populations, which are evaluated with a fitness function based on the prediction model established above; finally, the genetic algorithm iterates to the optimal combination of process parameters, as sketched in the code after Fig. 4. The specific algorithm flow is shown in Fig. 4.

Figure 4. The optimization process of welding process parameters.
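
As a concrete illustration, the following real-coded GA sketch optimizes the six process parameters against the trained prediction model. It is a minimal rendering under stated assumptions, not the authors' implementation: `predict_stress` stands for the trained model's prediction function, and `bounds` holds hypothetical feasible ranges for each parameter.

```python
import numpy as np

def optimize_parameters(predict_stress, bounds, pop=50, gens=100,
                        pc=0.7, pm=0.01, seed=None):
    """Real-coded GA: each chromosome is one candidate combination of the
    six process parameters; the fitness to minimize is the residual
    stress predicted by the trained model."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T   # bounds: [(min, max)] * 6
    P = rng.uniform(lo, hi, (pop, len(bounds)))
    for _ in range(gens):
        f = np.array([predict_stress(c) for c in P])
        # binary tournament selection (smaller predicted stress wins)
        a, b = rng.integers(pop, size=(2, pop))
        P = P[np.where(f[a] < f[b], a, b)]
        # arithmetic crossover on adjacent pairs
        for i in range(0, pop - 1, 2):
            if rng.random() < pc:
                t = rng.random()
                P[i], P[i + 1] = (t * P[i] + (1 - t) * P[i + 1],
                                  t * P[i + 1] + (1 - t) * P[i])
        # uniform mutation: resample mutated genes within their ranges
        mask = rng.random(P.shape) < pm
        P[mask] = rng.uniform(lo, hi, P.shape)[mask]
    f = np.array([predict_stress(c) for c in P])
    return P[f.argmin()], f.min()
```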

To demonstrate the feasibility of the method proposed in this paper, welding experiments on ship unit components are conducted in cooperation with a large shipyard. The experiments verify that the proposed method accurately predicts the welding quality of ship unit-assembly welding.

Based on the industrial IoT framework for data acquisition and processing of ship component welding, the welding data collection method is validated, as shown in Fig. 5 (a clearer version is shown in Supplementary Figure S2). The ship plate used in the investigation is general-strength hull structural steel Q235B with dimensions of 300 mm × 150 mm × 5 mm, and the welding process chosen is center surfacing welding of the ship component. In the welding experiment on the ship component structure, the digital welding machine selected is Panasonic's all-digital Metal Inert Gas welding machine, model YD-350GS5 of the GP5 series, which has a built-in IoT module and an analog communication interface. The automatic welding robot is a Panasonic TAWERS welding robot, which can realize very low-spatter welding of ship components. To collect the key welding process parameters, intelligent sensors and precision measuring instruments are provided in the experiment: a wire-threading sensor for CO2 welding monitors the elongation of the welding wire during welding, a TH2810 inductance measuring instrument measures the inductance, and a mass flow controller on the shielding gas measures the welding gas flow during the welding process (a more complete description is given in Supplementary Table S1).

Figure 5. The welding data acquisition system of the ship component structure.

The equipment used for welding data transmission includes communication interface devices and a serial-port network module. For the digital welding machines and welding robots, the PLC provides analog input modules that receive standard voltage or current signals converted by transmitters; after calculation and analysis in the PLC, the data are displayed on the human–machine interface (HMI) at the welding site through the communication interface device and communication protocol. Each intelligent sensor configured in the experiment has its own communication interface, such as RS232 or RS485, so the serial-port network module can establish connections and perform data protocol conversion for each sensor. Wireless Fidelity (Wi-Fi) transmission and Ethernet are set up through a radio-frequency (RF) antenna and a WAN/LAN conversion component to support welding data read and write operations between the welding site and the upper computer. In this case, a MySQL database is used to store, manage, and share the welding data.
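
For instance, a minimal Python sketch of the storage step might look as follows; the library choice (PyMySQL) and the table schema are our assumptions, not the system's actual implementation:

```python
import pymysql  # assumed MySQL client; install with `pip install pymysql`

def store_sample(conn, sample):
    """Insert one welding record received from the site network into MySQL.
    The welding_data table and its columns are hypothetical."""
    sql = ("INSERT INTO welding_data "
           "(ts, current_a, voltage_v, speed_mm_s, wire_elong_mm, "
           "gas_flow_l_min, inductance_uh) "
           "VALUES (%s, %s, %s, %s, %s, %s, %s)")
    with conn.cursor() as cur:
        cur.execute(sql, sample)
    conn.commit()

# conn = pymysql.connect(host="127.0.0.1", user="welder",
#                        password="...", database="workshop")
# store_sample(conn, ("2024-01-10 08:30:00", 220.0, 24.5,
#                     6.0, 12.0, 18.0, 60.0))
```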

Residual stresses are measured by the blind hole method on the finished welded steel plate; the residual stress value reflects the quality of the weld. The blind hole method applies a strain gauge to the surface of the workpiece to be measured; a hole is then drilled into the workpiece, causing stress relaxation around the hole and generating a new stress field distribution. The released strain is collected by the strain gauge, and the original residual stress and strain of the workpiece are deduced from the principles of elasticity.

The measurement equipment consists of a stepper-adjustable drilling machine, three-phase strain gauges, a TST3822E static strain test analyzer, and its software. The blind hole is 2 mm in diameter and 3 mm deep, and the measured stress is the average of the stress distribution over the depth of the blind hole. According to the direction of action, welding residual stresses are divided into longitudinal residual stresses parallel to the weld axis and transverse residual stresses perpendicular to the weld. In this experiment, a three-phase right-angle strain gauge rosette is used, with gauges laid out at 0°, 45°, and 90° to measure the longitudinal, principal, and transverse strains, respectively. Since the distribution of longitudinal residual stresses is more regular than that of transverse residual stresses, only the strains in the 0° direction are considered in this experiment. The strain change along the weld direction is measured, and the analysis software then yields the longitudinal residual stress. Figure 6 shows the operation site and the sticking positions of the strain gauges for the blind hole experiment. The residual stresses of each plate are collected through the TST3822E static strain test analyzer and computer software.

Figure 6. Blind hole method to collect residual stress.

Preprocessing the welding process data

According to the correlation between the collected welding data and weld formation quality, the Relief-F algorithm, implemented in MATLAB, assigns an influence weight to each data feature. Features whose weight is below the threshold value, i.e. data types irrelevant or only weakly related to weld formation quality, are excluded. The collected data include welding current, arc voltage, welding speed, welding gun angle, steel plate thickness, welding gas flow, welding wire diameter, inductance value, and welding wire elongation. The Relief-F algorithm requires the number of neighbors and the number of samplings to be set; combined with the experimental sample data, this case selects the numbers of neighbors \(k=3,5,7,8,9,10,15,18,25\) and the number of samplings \(m=80\). The calculation results are shown in Fig. 7. The average of the calculation results of each group is used as the final weight of each data feature, as given in Table 1.

Figure 7. The final weight of each data feature.

Among the collected welding data features, arc voltage and welding current have the largest influence weights on weld formation quality, 0.254 and 0.232, respectively. The main reason is that increasing the arc voltage and welding current directly increases the heat input to the weld seam and the melting amount of the welding wire, thereby increasing the width, penetration, and reinforcement of the weld seam. The next most influential feature is the welding speed, with an influence weight of 0.173. When the welding speed is too high, the weld seam cools too quickly, reducing deposition and metal coverage and degrading weld formation quality; when it is too low, heat concentrates and the molten pool widens, causing burn-through and other welding defects. In addition, in CO2 gas-shielded welding, the gas flow rate is a key parameter affecting weld formation quality, with a calculated influence weight of 0.171: an excessive gas flow destabilizes and disturbs the molten pool and arc, producing turbulence and spatter, while an insufficient flow directly weakens the protective effect of the gas and degrades weld formation quality. The inductance value affects weld penetration, with a calculated weight of 0.16, and the welding wire elongation directly affects the protective effect of the gas, with an estimated weight of 0.144. The welding gun angle, steel plate thickness, and welding wire diameter also have some influence on weld formation quality, with calculated weights of 0.13, 0.08, and 0.05, respectively.

In this verification case, the data sample size for CO2 gas-shielded welding of the ship component structure is 350 and \(\alpha\) is 0.145. The threshold range for the influence weight on weld formation quality in CO2 gas-shielded welding of the ship component structure is calculated as \(0<\tau \le 0.14\). Combined with the feature weight results, the influencing factors whose weight exceeds the threshold are taken as the main process parameters in this experiment: arc voltage, welding current, welding speed, welding gas flow, inductance value, and welding wire extension length. These serve as the key input variables for constructing the welding quality prediction model.

Predict the welding quality

Using MATLAB as the verification platform, this case applies the APB algorithm model to predict the welding quality of the ship component. 300 sets of welding data are selected to train the algorithm model (complete data in Supplementary Table S2), and 74 sets are selected for verification; the verification data set is shown in Table 2 (complete data in Supplementary Table S3). The weld forming coefficient is taken as the target variable, and six key welding quality influencing factors are selected according to the feature selection results in the “Preprocessing the welding process data” section: welding current, arc voltage, welding speed, welding wire elongation, inductance value, and welding gas flow.

After many experiments with the above welding data, the upper limit \({w}_{max}\) is set to 0.9, the lower limit \({w}_{min}\) to 0.4, and the maximum number of PSO iterations \({k}_{max}\) to 1000. The parameter combination of the self-cognitive and social cognitive factors is \({c}_{1max}=2.5\), \({c}_{1min}=1.25\), \({c}_{2max}=2.5\), and \({c}_{2min}=1.25\); this balances the APB algorithm's global search capability and convergence speed and achieves good results. The Metropolis criterion of the SA algorithm is introduced into the iterative calculation of the PSO algorithm; in the case verification, the initial temperature \(T\left(0\right)={10}^{4}\) is attenuated by the cooling coefficient \(\mu =0.9\). The 24 sets of welding data in Table 2 are substituted into the trained APB algorithm model to predict and verify the weld forming coefficient. The actual output value of each verification sample is compared with the expected value and the relative error is calculated, as shown in Table 3 (complete data in Supplementary Table S4). In this case, the maximum and minimum relative prediction errors of the APB (SA-PSO-BPNN) algorithm model on the validation sample data set are 8.764% and 5.364%, respectively. In general, the error of the proposed prediction algorithm is small and satisfies the accuracy requirements for predicting the welding quality of ship components.

In addition, to further demonstrate the improvements and advantages of the proposed APB algorithm model, the BPNN, the BPNN optimized by the particle swarm optimization algorithm (PSO-BP), and the APB algorithm are each applied to the same welding data set to predict the residual welding stress. Some comparison results are shown in Fig. 8.

Figure 8. The predictive outputs and comparison results of different algorithms.

The calculated values of the algorithm evaluation indicators R2, MAPE, and RMSE are shown in Table 4. In comparison, the prediction accuracy of the PSO-BP algorithm on the welding data samples is higher than that of the BPNN, and the prediction accuracy of the APB algorithm is in turn significantly improved over PSO-BP.

Optimize the welding process parameters

Several workpieces with high welding residual stress are found in the experiment. The quality of these workpieces does not meet the requirements, so they are scrapped, which brings unnecessary economic loss to the enterprise. To reduce the loss and improve efficiency, the unqualified combinations of welding process parameters are optimized using the global optimization ability of the genetic algorithm.

The relevant parameters of the genetic algorithm are selected as follows: the maximum number of evolutionary generations, population size, crossover probability, and mutation probability are 100, 50, 0.7, and 0.01, respectively. The proposed prediction model is used as the objective function: the smaller the residual stress value, the higher the fitness. The experiment is then carried out again to optimize the defective combinations of process parameters in real time, and the residual stress of the optimized products is re-measured; the results are shown in Table 5. The experimental results show that the optimized combinations of process parameters yield products with lower residual stress, improving quality and reducing economic losses, and can serve as a reference for real-time improvement of the welding process in enterprises.

Conclusion and future works

To meet the requirements of real-time monitoring and accurate prediction of ship unit-assembly welding quality, an IoT-based welding data acquisition framework is first established and stable, reliable data are obtained. The welding process monitoring elements are determined based on feature dimensionality reduction: according to the Relief-F algorithm, the crucial feature data are selected from the historical dataset. Secondly, the correlation rule between process parameters and welding quality is established, and the prediction model of ship unit-assembly welding is constructed by fusing adaptive simulated annealing, particle swarm optimization, and the back-propagation neural network. The genetic algorithm is selected to optimize the welding parameters. Finally, the experimental welding of a ship component is used as an example to verify the effectiveness of the proposed key techniques.

The experimental results show that the proposed APB prediction model predicts the welding characteristics more effectively than traditional methods, with a prediction accuracy above 91.236%: the coefficient of determination (R2) increases from 0.659 to 0.952, the mean absolute percentage error (MAPE) falls from 39.83% to 1.77%, and the root mean square error (RMSE) falls from 0.4933 to 0.0709, showing higher prediction accuracy. This proves that the technique can be used for online monitoring and accurate prediction of the welding quality of ship components. It realizes real-time collection and efficient transmission of welding big data, including welding process parameters, information on the operating status of welding equipment, and welding quality indicators. In addition, with the support of new-generation information technologies such as the IoT and big data, the dynamic quality data of the welding process can be tracked in real time and fully mined to realize online monitoring and accurate prediction of welding quality. As automated welding equipment is applied and developed, more welding quality data and influence factors are obtained; with the continuous updating and mining of welding data, ever more accurate welding quality prediction models need to be established.

The proposed method supports dynamic control of the processing quality of ship unit-assembly welding. However, its implementation is limited by the diversity and complexity of the ship section assembly-welding process, so more effort and innovation are needed to overcome these limitations. It is necessary to improve the perception and management of real-time data in the IoT system so as to promote the deep fusion of physical and virtual workshops and establish more reliable virtual models and multi-dimensional welding simulations. Meanwhile, with the support of a more complete real-time database and welding quality mapping mechanism, the ship welding quality analysis capability can be continuously enhanced, and the processing quality prediction method can be further improved and innovated.

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

References

Stanic, V., Hadjina, M., Fafandjel, N. & Matulja, T. Toward shipbuilding 4.0—An Industry 4.0 changing the face of the shipbuilding industry. Brodogradnja 69, 111–128. https://doi.org/10.21278/brod69307 (2018).

Ang, J., Goh, C., Saldivar, A. & Li, Y. Energy-efficient through-life smart design, manufacturing and operation of ships in an Industry 4.0 environment. Energies 10, 610. https://doi.org/10.3390/en10050610 (2017).

Remes, H. & Fricke, W. Influencing factors on fatigue strength of welded thin plates based on structural stress assessment. Weld. World 6, 915–923. https://doi.org/10.1007/s40194-014-0170-7 (2014).

Remes, H. et al. Factors affecting the fatigue strength of thin-plates in large structures. Int. J. Fatigue 101, 397–407. https://doi.org/10.1016/j.ijfatigue.2016.11.019 (2017).

Li, L., Liu, D., Ren, S., Zhou, H. & Zhou, J. Prediction of welding deformation and residual stress of a thin plate by improved support vector regression. Scanning 2021, 1–10. https://doi.org/10.1155/2021/8892128 (2021).

Fricke, W. et al. Fatigue strength of laser-welded thin-plate ship structures based on nominal and structural hot-spot stress approach. Ships Offshore Struct. 10, 39–44. https://doi.org/10.1080/17445302.2013.850208 (2015).

Li, L., Liu, D., Liu, J., Zhou, H. & Zhou, J. Quality prediction and control of assembly and welding process for ship group product based on digital twin. Scanning 2020, 1–13. https://doi.org/10.1155/2020/3758730 (2020).

Franciosa, P., Sokolov, M., Sinha, S., Sun, T. & Ceglarek, D. Deep learning enhanced digital twin for closed-loop in-process quality improvement. CIRP Ann. 69, 369–372. https://doi.org/10.1016/j.cirp.2020.04.110 (2020).

Febriani, R. A., Park, H.-S. & Lee, C.-M. An approach for designing a platform of smart welding station system. Int. J. Adv. Manuf. Technol. 106, 3437–3450. https://doi.org/10.1007/s00170-019-04808-6 (2020).

Liu, J. et al. Digital twin-enabled machining process modeling. Adv. Eng. Inf. 54, 101737. https://doi.org/10.1016/j.aei.2022.101737 (2022).

Liu, J. et al. A digital twin-driven approach towards traceability and dynamic control for processing quality. Adv. Eng. Inf. 50, 101395. https://doi.org/10.1016/j.aei.2021.101395 (2021).

Chen, J., Wang, T., Gao, X. & Wei, L. Real-time monitoring of high-power disk laser welding based on support vector machine. Comput. Ind. 94, 75–81. https://doi.org/10.1016/j.compind.2017.10.003 (2018).

Rauber, T. W., De Assis Boldt, F. & Varejao, F. M. Heterogeneous feature models and feature selection applied to bearing fault diagnosis. IEEE Trans. Ind. Electron. 62, 637–646. https://doi.org/10.1109/TIE.2014.2327589 (2015).

Bahmanyar, A. R. & Karami, A. Power system voltage stability monitoring using artificial neural networks with a reduced set of inputs. Int. J. Electr. Power Energy Syst. 58, 246–256. https://doi.org/10.1016/j.ijepes.2014.01.019 (2014).

Rostami, M., Berahmand, K., Nasiri, E. & Forouzandeh, S. Review of swarm intelligence-based feature selection methods. Eng. Appl. Artif. Intell. 100, 104210. https://doi.org/10.1016/j.engappai.2021.104210 (2021).

Liao, T. W. Improving the accuracy of computer-aided radiographic weld inspection by feature selection. NDT E Int. 42, 229–239. https://doi.org/10.1016/j.ndteint.2008.11.002 (2009).

Jiang, H. et al. Convolution neural network model with improved pooling strategy and feature selection for weld defect recognition. Weld. World 65, 731–744. https://doi.org/10.1007/s40194-020-01027-6 (2021).

Zhang, Z. et al. Multisensor-based real-time quality monitoring by means of feature extraction, selection and modeling for Al alloy in arc welding. Mech. Syst. Signal Process. 60–61, 151–165. https://doi.org/10.1016/j.ymssp.2014.12.021 (2015).

Abdel-Basset, M., El-Shahat, D., El-henawy, I., de Albuquerque, V. H. C. & Mirjalili, S. A new fusion of grey wolf optimizer algorithm with a two-phase mutation for feature selection. Expert Syst. Appl. 139, 112824. https://doi.org/10.1016/j.eswa.2019.112824 (2020).

Le, T. T. et al. Differential privacy-based evaporative cooling feature selection and classification with relief-F and random forests. Bioinformatics 33, 2906–2913. https://doi.org/10.1093/bioinformatics/btx298 (2017).

Pal, S., Pal, S. K. & Samantaray, A. K. Neurowavelet packet analysis based on current signature for weld joint strength prediction in pulsed metal inert gas welding process. Sci. Technol. Weld. Join. 13, 638–645. https://doi.org/10.1179/174329308X299986 (2008).

Pal, S., Pal, S. K. & Samantaray, A. K. Prediction of the quality of pulsed metal inert gas welding using statistical parameters of arc signals in artificial neural network. Int. J. Comput. Integr. Manuf. 23, 453–465. https://doi.org/10.1080/09511921003667698 (2010).

Nykänen, T., Björk, T. & Laitinen, R. Fatigue strength prediction of ultra high strength steel butt-welded joints. Fatigue Fract. Eng. Mat. Struct. 36, 469–482. https://doi.org/10.1111/ffe.12015 (2013).

Lu, J., Shi, Y., Bai, L., Zhao, Z. & Han, J. Collaborative and quantitative prediction for reinforcement and penetration depth of weld bead based on molten pool image and deep residual network. IEEE Access 8, 126138–126148. https://doi.org/10.1109/ACCESS.2020.3007815 (2020).

Luo, Y., Li, J. L. & Wu, W. Nugget quality prediction of resistance spot welding on aluminium alloy based on structureborne acoustic emission signals. Sci. Technol. Weld. Join. 18, 301–306. https://doi.org/10.1179/1362171812Y.0000000102 (2013).

Shim, J.-Y., Zhang, J.-W., Yoon, H.-Y., Kang, B.-Y. & Kim, I.-S. Prediction model for bead reinforcement area in automatic gas metal arc welding. Adv. Mech. Eng. 10, 168781401878149. https://doi.org/10.1177/1687814018781492 (2018).

Lei, Z., Shen, J., Wang, Q. & Chen, Y. Real-time weld geometry prediction based on multi-information using neural network optimized by PCA and GA during thin-plate laser welding. J. Manuf. Process. 43, 207–217. https://doi.org/10.1016/j.jmapro.2019.05.013 (2019).

Chaki, S., Bathe, R. N., Ghosal, S. & Padmanabham, G. Multi-objective optimisation of pulsed Nd:YAG laser cutting process using integrated ANN–NSGAII model. J. Intell. Manuf. 29, 175–190. https://doi.org/10.1007/s10845-015-1100-2 (2018).

Wang, Y. et al. Weld reinforcement analysis based on long-term prediction of molten pool image in additive manufacturing. IEEE Access 8, 69908–69918. https://doi.org/10.1109/ACCESS.2020.2986130 (2020).

Hartl, R., Praehofer, B. & Zaeh, M. Prediction of the surface quality of friction stir welds by the analysis of process data using artificial neural networks. Proc. Inst. Mech. Eng. Part L J. Mater. Des. Appl. 234, 732–751. https://doi.org/10.1177/1464420719899685 (2020).

Chang, Y., Yue, J., Guo, R., Liu, W. & Li, L. Penetration quality prediction of asymmetrical fillet root welding based on optimized BP neural network. J. Manuf. Process. 50, 247–254. https://doi.org/10.1016/j.jmapro.2019.12.022 (2020).

Jin, C., Shin, S., Yu, J. & Rhee, S. Prediction model for back-bead monitoring during gas metal arc welding using supervised deep learning. IEEE Access 8, 224044–224058. https://doi.org/10.1109/ACCESS.2020.3041274 (2020).

Hu, W. et al. Improving the mechanical property of dissimilar Al/Mg hybrid friction stir welding joint by PIO-ANN. J. Mater. Sci. Technol. 53, 41–52. https://doi.org/10.1016/j.jmst.2020.01.069 (2020).

Cruz, Y. J. et al. Ensemble of convolutional neural networks based on an evolutionary algorithm applied to an industrial welding process. Comput. Ind. 133, 103530. https://doi.org/10.1016/j.compind.2021.103530 (2021).

Pereda, M., Santos, J. I., Martín, Ó. & Galán, J. M. Direct quality prediction in resistance spot welding process: Sensitivity, specificity and predictive accuracy comparative analysis. Sci. Technol. Weld. Join. 20, 679–685. https://doi.org/10.1179/1362171815Y.0000000052 (2015).

Wang, T., Chen, J., Gao, X. & Li, W. Quality monitoring for laser welding based on high-speed photography and support vector machine. Appl. Sci. 7, 299. https://doi.org/10.3390/app7030299 (2017).

Das, B., Pal, S. & Bag, S. Torque based defect detection and weld quality modelling in friction stir welding process. J. Manuf. Process. 27, 8–17. https://doi.org/10.1016/j.jmapro.2017.03.012 (2017).

Petković, D. Prediction of laser welding quality by computational intelligence approaches. Optik 140, 597–600. https://doi.org/10.1016/j.ijleo.2017.04.088 (2017).

Yu, R., Han, J., Zhao, Z. & Bai, L. Real-time prediction of welding penetration mode and depth based on visual characteristics of weld pool in GMAW process. IEEE Access 8, 81564–81573. https://doi.org/10.1109/ACCESS.2020.2990902 (2020).

Casalino, G., Campanelli, S. L. & Memola Capece Minutolo, F. Neuro-fuzzy model for the prediction and classification of the fused zone levels of imperfections in Ti6Al4V alloy butt weld. Adv. Mater. Sci. Eng. 2013, 1–7. https://doi.org/10.1155/2013/952690 (2013).

Rout, A., Bbvl, D., Biswal, B. B. & Mahanta, G. B. A fuzzy-regression-PSO based hybrid method for selecting welding conditions in robotic gas metal arc welding. Assem. Autom. 40, 601–612. https://doi.org/10.1108/AA-12-2019-0223 (2020).

Kim, K.-Y. & Ahmed, F. Semantic weldability prediction with RSW quality dataset and knowledge construction. Adv. Eng. Inf. 38, 41–53. https://doi.org/10.1016/j.aei.2018.05.006 (2018).

AbuShanab, W. S., AbdElaziz, M., Ghandourah, E. I., Moustafa, E. B. & Elsheikh, A. H. A new fine-tuned random vector functional link model using Hunger games search optimizer for modeling friction stir welding process of polymeric materials. J. Mater. Res. Technol. 14, 1482–1493. https://doi.org/10.1016/j.jmrt.2021.07.031 (2021).

Kennedy, J. Particle swarm optimization. In Encyclopedia of Machine Learning (eds Sammut, C. & Webb, G. I.) 760–766. https://doi.org/10.1007/978-0-387-30164-8_630 (2011).

Sun, C. et al. Prediction method of concentricity and perpendicularity of aero engine multistage rotors based on PSO-BP neural network. IEEE Access 7, 132271–132278. https://doi.org/10.1109/ACCESS.2019.2941118 (2019).

Holland, J. H. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence (The MIT Press, 1992). https://doi.org/10.7551/mitpress/1090.001.0001.

Acknowledgements

The work is supported by the National Natural Science Foundation of China under Grants 52075229 and 52371324, in part by the Provincial Natural Science Foundation of China under Grant KYCX20_3121, and by the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant SJCX22_1923. It is also sponsored by the Jiangsu Qinglan Project.

Author information

Authors and Affiliations

Jiangsu University of Science and Technology, Zhenjiang, 212100, Jiangsu, China

Jinfeng Liu, Yifa Cheng, Xuwen Jing & Yu Chen

Southeast University, Nanjing, 211189, China

Xiaojun Liu


Contributions

All authors contributed to the study conception and design. The first draft of the manuscript was written by J.L., manuscript review and editing were performed by Y.C., X.J., X.L. and Y.C. All authors commented on previous versions of the manuscript. All authors have read and agreed to the published version of the manuscript.

Corresponding authors

Correspondence to Jinfeng Liu or Xuwen Jing .

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article

Liu, J., Cheng, Y., Jing, X. et al. Prediction and optimization method for welding quality of components in ship construction. Sci Rep 14 , 9353 (2024). https://doi.org/10.1038/s41598-024-59490-w

Download citation

Received : 10 January 2024

Accepted : 11 April 2024

Published : 23 April 2024

DOI : https://doi.org/10.1038/s41598-024-59490-w


Keywords

  • Quality prediction
  • Components welding
  • Welding quality

By submitting a comment you agree to abide by our Terms and Community Guidelines . If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Quick links

  • Explore articles by subject
  • Guide to authors
  • Editorial policies

Sign up for the Nature Briefing: AI and Robotics newsletter — what matters in AI and robotics research, free to your inbox weekly.

case study method data collection

At this time, we recommend all  Penn-affiliated  travel to Israel, West Bank, Gaza, and Lebanon be deferred.  If you are planning travel to any of these locations, please reach out to [email protected] for the most up to date risk assessment and insurance exclusions. As a reminder, it is required that all Penn-affiliated trips are registered in  MyTrips .  If you have questions, please contact  [email protected]

Utility Navigation

Utility links.

  • University of Pennsylvania
  • Office of the Provost
  • Penn Global

Secondary Nav Penn Global

  • For Penn Students
  • For Penn Faculty
  • For Alumni & Friends

Primary Nav Penn Global

Drawer menu penn global.

  • Back to main menu
  • Our Strategic Framework
  • Perry World House
  • Penn Biden Center
  • Penn Abroad
  • International Student & Scholar Services
  • Global Support Services
  • Penn in Africa
  • Penn in China
  • 2022 PLAC Symposium
  • Pulitzer International Reporting Student Fellowship
  • Connect with PLAC
  • Penn in Oceania
  • Penn in the Middle East
  • Penn in Northern America
  • Global at Penn's Schools
  • Global Centers & Programs
  • Global Engagement Fund
  • China Research and Engagement Fund
  • India Research and Engagement Fund
  • Holman Africa Research and Engagement Fund
  • Apply for a Convening Grant
  • Apply for a Research Grant
  • Manage My Grant
  • Grants Database

PENN GLOBAL RESEARCH & ENGAGEMENT GRANT PROGRAM 2024 Grant Program Awardees

Basic page sidebar menu penn global.

In 2024, Penn Global will support 24 new faculty-led research and engagement projects at a total funding level of $1.5 million.

The Penn Global Research and Engagement Grant Program prioritizes projects that bring together leading scholars and practitioners across the University community and beyond to develop new insight on significant global issues in key countries and regions around the world, a core pillar of Penn’s global strategic framework. 

PROJECTS SUPPORTED BY THE HOLMAN AFRICA RESEARCH AND ENGAGEMENT FUND

  • Global Medical Physics Training & Development Program  Stephen Avery, Perelman School of Medicine
  • Developing a Dakar Greenbelt with Blue-Green Wedges Proposal  Eugenie Birch, Weitzman School of Design
  • Emergent Judaism in Sub-Saharan Africa  Peter Decherney, School of Arts and Sciences / Sara Byala, School of Arts and Sciences
  • Determinants of Cognitive Aging among Older Individuals in Ghana  Irma Elo, School of Arts and Sciences
  • Disrupted Aid, Displaced Lives Guy Grossman, School of Arts and Sciences
  • A History of Regenerative Agriculture Practices from the Global South: Case Studies from Ethiopia, Kenya, and Zimbabwe Thabo Lenneiye, Kleinman Energy Center / Weitzman School of Design
  • Penn Computerized Neurocognitive Battery Use in Botswana Public Schools Elizabeth Lowenthal, Perelman School of Medicine
  • Podcasting South African Jazz Past and Present Carol Muller, School of Arts and Sciences
  • Lake Victoria Megaregion Study: Joint Lakefront Initiative Frederick Steiner, Weitzman School of Design
  • Leveraging an Open Source Software to Prevent and Contain AMR Jonathan Strysko, Perelman School of Medicine
  • Poverty reduction and children's neurocognitive growth in Cote d'Ivoire Sharon Wolf, Graduate School of Education
  • The Impacts of School Connectivity Efforts on Education Outcomes in Rwanda  Christopher Yoo, Carey Law School

PROJECTS SUPPORTED BY THE INDIA RESEARCH AND ENGAGEMENT FUND

  • Routes Beyond Conflict: A New Approach to Cultural Encounters in South Asia  Daud Ali, School of Arts and Sciences
  • Prioritizing Air Pollution in India’s Cities Tariq Thachil, Center for the Advanced Study of India / School of Arts and Sciences
  • Intelligent Voicebots to Help Indian Students Learn English Lyle Ungar, School of Engineering and Applied Sciences

PROJECTS SUPPORTED BT THE CHINA RESEARCH AND ENGAGEMENT FUND

  • Planning Driverless Cities in China Zhongjie Lin, Weitzman School of Design

PROJECTS SUPPORTED BY THE GLOBAL ENGAGEMENT FUND 

  • Education and Economic Development in Nepal Amrit Thapa, Graduate School of Education
  • Explaining Climate Change Regulation in Cities: Evidence from Urban Brazil Alice Xu, School of Arts and Sciences
  • Nurse Staffing Legislation for Scotland: Lessons for the U.S. and the U.K.  Eileen Lake, School of Nursing
  • Pathways to Education Development & Their Consequences: Finland, Korea, US Hyunjoon Park, School of Arts and Sciences
  • Engaged Scholarship in Latin America: Bridging Knowledge and Action Tulia Falleti, School of Arts and Sciences
  • Organizing Migrant Communities to Realize Rights in Palermo, Sicily  Domenic Vitiello, Weitzman School of Design
  • Exploiting Cultural Heritage in 21st Century Conflict   Fiona Cunningham, School of Arts and Sciences
  • Center for Integrative Global Oral Health   Alonso Carrasco-Labra, School of Dental Medicine

This first-of-its-kind Global Medical Physics Training and Development Program (GMPTDP) seeks to serve as an opportunity for PSOM and SEAS graduate students to enhance their clinical requirement with a global experience, introduce them to global career opportunities and working effectively in different contexts, and strengthens partnerships for education and research between US and Africa. This would also be an exceptional opportunity for pre-med/pre-health students and students interested in health tech to have a hands-on global experience with some of the leading professionals in the field. The project will include instruction in automated radiation planning through artificial intelligence (AI); this will increase access to quality cancer care by standardizing radiation planning to reduce inter-user variability and error, decreasing workload on the limited radiation workforce, and shortening time to treatment for patients. GMPTDP will offer a summer clinical practicum to Penn students during which time they will also collaborate with UGhana to implement and evaluate AI tools in the clinical workflow.

The proposal will address today’s pressing crises of climate change, land degradation, biodiversity loss, and growing economic disparities with a holistic approach that combines regional and small-scale actions necessary to achieve sustainability. It will also tackle a key issue found across sub-Saharan Africa, many emerging economies, and economically developed countries that struggle to control rapid unplanned urbanization that vastly outpaces the carrying capacity of the surrounding environment.

The regional portion of the project will create a framework for a greenbelt that halts the expansion of the metropolitan footprint. It will also protect the Niayes, an arable strip of land that produces over 80% of the country’s vegetables, from degradation. This partnership will also form a south-south collaboration to provide insights into best practices from a city experiencing similar pressures.

The small-scale portion of the project will bolster and create synergy with ongoing governmental and grassroots initiatives aimed at restoring green spaces currently being infilled or degraded in the capital. This will help to identify overlapping goals between endeavors, leading to collaboration and mobilizing greater funding possibilities instead of competing over the same limited resources. With these partners, we will identify and design Nature-based Solutions for future implementation.

Conduct research through fieldwork to examine questions surrounding Jewish identity in Africa. Research will be presented in e.g. articles, photographic images, and films, as well as in a capstone book. In repeat site-visits to Uganda, South Africa, Ghana, and Zimbabwe, we will conduct interviews with and take photographs of stakeholders from key communities in order to document their everyday lives and religious practices.

The overall aim of this project is the development of a nationally representative study on aging in Ghana. This goal requires expanding our network of Ghanian collaborators and actively engage them in research on aging. The PIs will build on existing institutional contacts in Ghana that include:

1). Current collaboration with the Navrongo Health Research Center (NCHR) on a pilot data collection on cognitive aging in Ghana (funded by a NIA supplement and which provides the matching funds for this Global Engagement fund grant application);

2) Active collaboration with the Regional Institute for Population Studies (RIPS), University of Ghana. Elo has had a long-term collaboration with Dr. Ayaga Bawah who is the current director of RIPS.

In collaboration with UNHCR, we propose studying the effects of a dramatic drop in the level of support for refugees, using a regression discontinuity design to survey 2,500 refugee households just above and 2,500 households just below the vulnerability score cutoff that determines eligibility for full rations. This study will identify the effects of aid cuts on the welfare of an important marginalized population, and on their livelihood adaptation strategies. As UNHCR faces budgetary cuts in multiple refugee-hosting contexts, our study will inform policymakers on the effects of funding withdrawal as well as contribute to the literature on cash transfers.

The proposed project, titled "A History of Regenerative Agriculture Practices from the Global South: Case Studies from Ethiopia, Kenya, and Zimbabwe," aims to delve into the historical and contemporary practices of regenerative agriculture in sub-Saharan Africa. Anticipated Outputs and Outcomes:

1. Research Paper: The primary output of this project will be a comprehensive research paper. This paper will draw from a rich pool of historical and contemporary data to explore the history of regenerative agriculture practices in Ethiopia, Kenya, and Zimbabwe. It will document the indigenous knowledge and practices that have sustained these regions for generations.

2. Policy Digest: In addition to academic research, the project will produce a policy digest. This digest will distill the research findings into actionable insights for policymakers, both at the national and international levels. It will highlight the benefits of regenerative agriculture and provide recommendations for policy frameworks that encourage its adoption.

3. Long-term Partnerships: The project intends to establish long-term partnerships with local and regional universities, such as Great Lakes University Kisumu, Kenya. These partnerships will facilitate knowledge exchange, collaborative research, and capacity building in regenerative agriculture practices. Such collaborations align with Penn Global's goal of strengthening institutional relationships with African partners.

The Penn Computerized Neurocognitive Battery (PCNB) was developed at the University of Pennsylvania by Dr. Ruben C. Gur and colleagues to be administered as part of a comprehensive neuropsychiatric assessment. Consisting of a series of cognitive tasks that help identify individuals’ cognitive strengths and weaknesses, it has recently been culturally adapted and validated by our team for assessment of school-aged children in Botswana . The project involves partnership with the Botswana Ministry of Education and Skills Development (MoESD) to support the rollout of the PCNB for assessment of public primary and secondary school students in Botswana. The multidisciplinary Penn-based team will work with partners in Botswana to guide the PCNB rollout, evaluate fidelity to the testing standards, and track student progress after assessment and intervention. The proposed project will strengthen a well-established partnership between Drs. Elizabeth Lowenthal and J. Cobb Scott from the PSOM and in-country partners. Dr. Sharon Wolf, from Penn’s Graduate School of Education, is an expert in child development who has done extensive work with the Ministry of Education in Ghana to support improvements in early childhood education programs. She is joining the team to provide the necessary interdisciplinary perspective to help guide interventions and evaluations accompanying this new use of the PCNB to support this key program in Africa.

This project will build on exploratory research completed by December 24, 2023, in which the PI interviewed about 35 South Africans involved in jazz and improvised music, mostly in Cape Town: venue owners, curators, creators, and improvisers. Anticipated outputs include:

  • A podcast series with 75-100 South African musicians interviewed, with their music interspersed in the program.
  • A 59-minute radio program with extended excerpts of music inserted into the interview itself.
  • A center of knowledge about South African jazz—its sound and its stories—building knowledge globally about this significant diasporic jazz community.
  • An expanded understanding of “jazz” as a more diffuse area of improvised music making that includes a wide range of contemporary indigenous music and art making.
  • A partnership with Lincoln Center Jazz (and South African Tourism) to host South Africans at Penn.

This study examines the potential of a Megaregional approach for fostering sustainable development, economic growth, and social inclusion within the East African Community (EAC), with specific attention to supporting the development of A Vision for An Inclusive Joint Lakefront across the five riparian counties in Kenya.

By leveraging the principles of Megaregion development, this project aims to create a unified socio-economic, planning, urbanism, cultural, and preservation strategy that transcends county boundaries and promotes collaboration further afield among the EAC member countries surrounding the Lake Victoria Basin.

Anticipated Outputs and Outcomes:

1. Megaregion Conceptual Framework: The project will develop a comprehensive Megaregion Conceptual Framework for the Joint Lakefront region in East Africa. This framework, which regions around the world have applied as a way of bridging local boundaries toward a unified regional vision, will give the Kisumu Lake region a path toward cooperative, multi-jurisdictional planning. The Conceptual Framework will be both broad and specific, including actionable strategies, projects, and initiatives aimed at sustainable development, economic growth, social inclusion, and environmental stewardship.

2. Urbanism Projects: Specific urbanism projects will be proposed for key urban centers within the Kenyan riparian counties. These projects will serve as tangible examples of potential improvements and catalysts for broader development efforts.

3. Research Publication: The findings of the study will be captured in a research publication, contributing to academic discourse and increasing Penn's visibility in the field of African urbanism and sustainable development.

Antimicrobial resistance (AMR) has emerged as a global crisis, causing more deaths than HIV/AIDS and malaria worldwide. By engaging in a collaborative effort with the Botswana Ministry of Health’s data scientists and experts in microbiology, human and veterinary medicine, and bioinformatics, we will aim to design new electronic medical record system modules that will:

Aim 1: Support the capturing, reporting, and submission of microbiology data from sentinel surveillance laboratories as well as pharmacies across the country.

Aim 2: Develop data analytic dashboards for visualizing and characterizing regional AMR and antimicrobial consumption (AMC) patterns.

Aim 3: Submit AMR and AMC data to regional and global surveillance programs.

Aim 4: Establish thresholds for alert notifications when disease activity exceeds expected incidence, to serve as an early warning system for outbreak detection.
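To illustrate the kind of rule Aim 4 describes, here is a minimal sketch assuming weekly organism counts from a single sentinel laboratory and a simple rolling-baseline threshold; the function name, window length, and multiplier are illustrative assumptions, not the project's actual design, which would likely use established aberration-detection methods:

```python
import statistics

def exceeds_threshold(history, current, window=8, k=2.0):
    """Flag the current weekly count if it exceeds the mean of the
    previous `window` weeks by more than `k` standard deviations."""
    baseline = history[-window:]
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    return current > mean + k * sd

# Illustrative weekly counts of one resistant organism at one sentinel lab.
weekly_counts = [3, 4, 2, 5, 3, 4, 4, 3]
print(exceeds_threshold(weekly_counts, current=12))  # True -> raise an alert
```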

Using a novel interdisciplinary approach that bridges development economics, psychology, and neuroscience, the overall goal of this project is to improve children's development using a poverty-reduction intervention in Côte d'Ivoire (CIV). The project will directly measure the impacts of cash transfers (CTs) on neurocognitive development, providing a greater understanding of how economic interventions can support the eradication of poverty and ensure that all children flourish and realize their full potential. The project will examine the causal mechanisms by which CTs support children's healthy neurocognitive development and learning outcomes through the novel use of an advanced neuroimaging tool, functional near-infrared spectroscopy (fNIRS), together with direct child assessments and parent interviews.

The proposed research, the GIGA initiative for Improving Education in Rwanda (GIER), will produce empirical evidence on the impact of connecting schools on education outcomes, enabling Rwanda to better understand how to accelerate efforts to bring connectivity to schools, how to improve instruction and learning among both teachers and students, and whether schools can become internet hubs capable of providing access to e-commerce and e-government services for surrounding communities. In addition to evaluating the impact of connecting schools on educational outcomes, the research will also help determine which aspects of the program are critical to success before it is rolled out nationwide.

Through historical epigraphic research, the project will test the hypothesis that historical processes and outcomes in the 14th century were precipitated by a series of related global and local factors, and that an interdisciplinary and synergistic analysis of these factors, embracing climatology, hydrology, epidemiology, linguistics, and migration, will explain the transformation of the cultural, religious, and social landscapes of the time more effectively than the ‘clash of civilizations’ paradigm dominant in the field. Outputs include a public online interface for the epigraphic archive; a major international conference at Penn with colleagues from partner universities (Ghent, Pisa, Edinburgh, and Penn) as well as the wider South Asia community; a graduate course built around the research project, on multi-disciplinary approaches to the problem of Hindu-Muslim interaction in medieval India; and a public-facing presentation of our findings and methods to demonstrate the path forward for Indian history. Several Penn students, including a postdoc, will be actively engaged.

India’s competitive electoral arena has failed to generate democratic accountability pressures to reduce toxic air. This project seeks to understand the barriers that prevent such pressures from developing, and how to overcome them. In doing so, it will provide the first systematic study of the attitudes and behaviors of citizens and elected officials regarding air pollution in India. The project will 1) conduct in-depth interviews with elected local officials in Delhi and a large-scale survey of elected officials in seven Indian states affected by air pollution, and 2) partner with relevant civil society organizations, international bodies such as the United Nations Environment Program (UNEP), domain experts at research centers such as the Public Health Foundation of India (PHFI), and local civic organizations (Janagraaha) to evaluate a range of potential strategies to address pollution apathy, including public information campaigns with highly affected citizens (PHFI) and local pollution reports for policymakers (Janagraaha).

The biggest benefit from generative AI such as GPT will be the widespread availability of tutoring systems to support education. The project will use this technology to build a conversational voicebot to support Indian students in learning English, engaging end users (Indian tutors and their students) from the beginning. The initial prototype voice-driven conversational system will be field-tested in Indian schools and adapted. The project includes three stages of development:

1) Develop our conversational agent: specify the exact initial use case and conduct preliminary user testing.

2) Fully localize to India, addressing issues identified in Phase 1 user testing.

3) Conduct comprehensive user testing, with detailed observation of 8-12 students using the agent over multiple months, along with additional assessments of other stakeholders.

The project will partner with Ashoka University and Pratham across all three stages, including on writing scholarly papers.
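As a rough illustration of the kind of conversational loop stage 1 might prototype, the text-only sketch below stubs out the model call; `generate_reply`, the system prompt, and the session structure are all hypothetical placeholders, not the project's actual system, and a deployed voicebot would add speech-to-text and text-to-speech components:

```python
def generate_reply(history):
    """Placeholder for the language-model call: a real system would send
    the conversation `history` to a generative model and return its reply."""
    return "Can you say that sentence again using the past tense?"

def tutoring_session():
    # Seed the conversation with the tutor's role, then alternate turns.
    history = [("system", "You are a friendly English tutor for Indian students.")]
    while True:
        student = input("Student (blank to stop): ").strip()
        if not student:
            break
        history.append(("student", student))
        reply = generate_reply(history)
        history.append(("tutor", reply))
        print("Tutor:", reply)

if __name__ == "__main__":
    tutoring_session()
```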

Through empirical policy analysis and data-based scenario planning, this project actively contributes to this global effort by investigating planning and policy responses to autonomous transportation in the US and China. In addition to publishing several research papers on this subject, the PI plans to develop a new course and organize a forum at PWCC in 2025. These initiatives are aligned with an overarching endeavor that the PI leads at the Weitzman School of Design, which aims to establish a Future Cities Lab dedicated to research and collaboration in the pursuit of sustainable cities.

This study aims to fill this gap through a more humanistic approach to measuring the impact of education on national development. Leveraging a mixed-methods research design consisting of analysis of quantitative data for trends over time, observations of schools and classrooms, and qualitative inquiry through conversations with people and their stories, we hope to build a comprehensive picture of educational trends in Nepal and their association with intra-country development. Through this project we strive to better inform the efforts of state authorities and international organizations working to enhance sustainable development within Nepal, while also creating space and guidance for further impact analyses. Among the various channels for disseminating the study's findings, one key goal is to incorporate them into a book on this topic.

Developing cities across the world have taken the lead in adopting local environmental regulation. Yet standard models of environmental governance begin with the assumption that local actors have no incentive to protect “the commons.” Given that the benefits of climate change regulation are diffuse, individual local actors face a collective action problem. This project explores why some local governments bear the costs of environmental regulation while most choose to free-ride. The anticipated outputs of the project include qualitative data that illuminate case studies and coded quantitative spatial data sets for studying urban land use. These different forms of data collection will allow me to develop and test a theoretical framework for understanding when and why city governments adopt environmental policy.

The proposed project will develop new insights on legislative solutions to the nurse staffing crisis, which pertain to many U.S. states and U.K. countries. The PI will supervise the nurse survey data collection and meet with government and nursing association stakeholders to plan the optimal preparation of reports and dissemination of results. The anticipated outputs of the project are a description of variation across Scotland in hospital nursing features, including nurse staffing, nurse work environments, the extent of adherence to the Law's required principles, duties, and method, and nurse intent to leave. The outcomes will include the development of capacity for sophisticated quantitative research among Scottish investigators, where such skills are greatly needed but currently lacking.

The proposed project will undertake multi-cohort, cross-national comparisons of the educational attainment and labor-market experiences of young adults in three countries that have diverged dramatically in how they have developed college education over the last three decades: Finland, South Korea, and the US. It will produce comparative knowledge regarding the consequences of different pathways to higher education, with significant policy implications for educational and economic inequality in Finland, Korea, the US, and beyond. The project will also lay the foundation for ongoing collaboration among the three country teams to seek external funding for sustained collaboration on educational analyses.

With matching funds from PLAC and CLALS, we will jointly fund four scholars from diverse Latin American and Caribbean (LAC) countries to participate in workshops engaging our community around successful practices of community-academic partnerships.

These four scholars and practitioners from Latin America, experts on community-engaged scholarship, will visit the Penn campus in early fall 2024. As part of their engagements on campus, and following the workshops, they will serve as key guest speakers at the 7th edition of the Penn in Latin America and the Caribbean (PLAC) Conference, held on October 11, 2024, at the Perry World House. The conference will focus on "Public and Community Engaged Scholarship in Latin America, the Caribbean, and their Diasporas."

Palermo, Sicily, has been a leading center of migrant rights advocacy and migrant civic participation in the twenty-first century. This project will engage an existing network of diverse migrant community associations and anti-mafia organizations in Palermo to take stock of migrant rights and support systems in the city. Our partner organizations, research assistants, and cultural mediators from different communities will design and conduct a survey and interviews documenting experiences, issues, and opportunities related to various rights: to asylum, housing, work, health care, food, education, and more. Our web-based report will include recommendations for city and regional authorities and other actors in civil society. The final phase of the project will involve community outreach and organizing to advance these objectives. The web site we create will serve as the network's information center, with a directory of civil society organizations and services, updating an inventory that has not been current since 2014 and that our partner Diaspore per la Pace will continue to maintain.

This interdisciplinary project has four objectives: 1) to investigate why some governments and non-state actors have elevated cultural heritage exploitation (CHX) to the strategic level of warfare alongside nuclear weapons, cyberattacks, political influence operations, and other “game changers”; 2) to determine which state or non-state actors (e.g., weak actors) use heritage for leverage in conflict, and why; 3) to identify the mechanisms through which CHX coerces an adversary (e.g., by catalyzing international involvement); and 4) to identify the best policy responses for states and non-state actors to address the challenge of CHX posed by their adversaries, based on the findings produced by the first three objectives.

This project will identify the capacity of dental schools and other organizations that train oral health professionals and conduct oral health research to contribute to oral health policies (OHPs) in the WHO Eastern Mediterranean region, identify the barriers to and facilitators of engagement in OHPs, and subsequently define research priority areas for the region in collaboration with the WHO, oral health academia, researchers, and other regional stakeholders.
