
Explore millions of high-quality primary sources and images from around the world, including artworks, maps, photographs, and more.

Explore migration issues through a variety of media types

  • Part of Street Art Graphics
  • Part of The Journal of Economic Perspectives, Vol. 34, No. 1 (Winter 2020)
  • Part of Cato Institute (Aug. 3, 2021)
  • Part of University of California Press
  • Part of Open: Smithsonian National Museum of African American History & Culture
  • Part of Indiana Journal of Global Legal Studies, Vol. 19, No. 1 (Winter 2012)
  • Part of R Street Institute (Nov. 1, 2020)
  • Part of Leuven University Press
  • Part of UN Secretary-General Papers: Ban Ki-moon (2007-2016)
  • Part of Perspectives on Terrorism, Vol. 12, No. 4 (August 2018)
  • Part of Leveraging Lives: Serbia and Illegal Tunisian Migration to Europe, Carnegie Endowment for International Peace (Mar. 1, 2023)
  • Part of UCL Press

Harness the power of visual materials—explore more than 3 million images now on JSTOR.

Enhance your scholarly research with underground newspapers, magazines, and journals.

Explore collections in the arts, sciences, and literature from the world’s leading museums, archives, and scholars.

What is Database Search?

Harvard Library licenses hundreds of online databases, giving you access to academic and news articles, books, journals, primary sources, streaming media, and much more.

The contents of these databases are only partially included in HOLLIS. To make sure you're really seeing everything, you need to search in multiple places. Use Database Search to identify and connect to the best databases for your topic.

In addition to digital content, you will find specialized search engines used in specific scholarly domains.

Related Services & Tools

Tomorrow's Research Today

SSRN provides 1,481,614 preprints and research papers from 1,918,444 researchers in over 65 disciplines.

Race and Mpox

New Networks

Research Disciplines

Applied Sciences

APPLIED SCIENCES are those disciplines, including applied and pure mathematics, that apply existing scientific knowledge to develop practical applications.

Health Sciences

HEALTH SCIENCES are those disciplines that address the application of science and technology to the delivery of healthcare.

Humanities

HUMANITIES are those disciplines that investigate human constructs, cultures and concerns, using critical and analytical approaches.

Life Sciences

LIFE SCIENCES are those disciplines that study living organisms, their life processes, and their relationships to each other and their environment.

Physical Sciences

PHYSICAL SCIENCES are those natural-science disciplines that study nonliving materials.

Social Sciences

SOCIAL SCIENCES are those disciplines that study (a) institutions and functioning of human society and the interpersonal relationships of individuals as members of society; (b) a particular phase or aspect of human society.

Blog

Products and Services

Recent announcements.

SSRN is devoted to the rapid worldwide dissemination of preprints and research papers and is composed of a number of specialized research networks.



Trending Articles

  • The multi-stage plasticity in the aggression circuit underlying the winner effect. Yan R, et al. Cell. 2024. PMID: 39406242
  • IRE1α silences dsRNA to prevent taxane-induced pyroptosis in triple-negative breast cancer. Xu L, et al. Cell. 2024. PMID: 39419025
  • Multiscale drug screening for cardiac fibrosis identifies MD2 as a therapeutic target. Zhang H, et al. Cell. 2024. PMID: 39413786
  • A two-front nutrient supply environment fuels small intestinal physiology through differential regulation of nutrient absorption and host defense. Zhang J, et al. Cell. 2024. PMID: 39427662
  • Breast Cancer in Users of Levonorgestrel-Releasing Intrauterine Systems. Mørch LS, et al. JAMA. 2024. PMID: 39412770

Latest Literature

  • Am J Med (1)
  • Eur J Clin Nutr (2)
  • J Biol Chem (5)
  • Metabolism (1)
  • Methods Mol Biol (22)
  • Nat Commun (31)
  • Nat Rev Cancer (1)
  • Sci Rep (140)



APA PsycInfo®

The premier abstracting and indexing database covering the behavioral and social sciences from the authority in psychology.

Support research goals

Institutional access to APA PsycInfo provides a single source of vetted, authoritative research for users across the behavioral and social sciences. Students and researchers enjoy seamless access to cutting-edge and historical content with citations in APA Style®.

The newest features available within APA PsycInfo leverage artificial intelligence and machine learning to equip users with a personalized research assistant that helps monitor trends, explore content analytics, and gain one-click access to full text within a centralized, essential source of credible psychology research.

Celebrating 55 years

For over 55 years, APA PsycInfo has been the most trusted index of psychological science in the world. With more than 5,000,000 interdisciplinary bibliographic records, our database delivers targeted discovery of credible and comprehensive research across the full spectrum of behavioral and social sciences. This indispensable resource continues to enhance the discovery and usage of essential psychological research to support students, scientists, and educators. Explore the past, present, and future of psychology research.

APA PsycInfo at a glance

  • Over 5,000,000 peer-reviewed records
  • 144 million cited references
  • Spanning 600 years of content
  • Updated twice-weekly
  • Research in 30 languages from 50 countries
  • Records from 2,400 journals
  • Content from journal articles, book chapters, and dissertations
  • AI and machine learning-powered research assistance

Support your campus community beyond psychology with APA PsycInfo’s broad subject coverage:

  • Artificial intelligence
  • Linguistics
  • Neuroscience
  • Pharmacology
  • Political science
  • Social work

Institutional trial

Evaluate this resource free for 30 days to determine if it meets your library’s research needs.

Access options

Select from individual subscriptions or institutional licenses on your platform of choice.

Find webinars, tutorials, and guides to help promote your library’s subscription.

Key benefits of APA PsycInfo

5 million records & growing!

Learn what’s new


AI-powered research tools

Join a webinar on new features courtesy of your access to APA PsycInfo


APA PsycInfo webinars

Help users search smarter this semester with guidance from APA training experts

Browse APA Databases and electronic products by publication type or subscriber type.

View Product Guide

More about APA PsycInfo

  • APA PsycInfo FAQs
  • Sample records
  • Coverage List
  • Full-Text Options
  • APA PsycInfo Publishers

APA PsycInfo research services

Simplify the research process for your users with this personalized research assistant, a free tool courtesy of your institution's subscription to APA PsycInfo. This service leverages AI and machine learning to ease access to full text, content analytics, and discovery of the latest behavioral sciences research.

Get Started

APA Databases

Find full-text articles, book chapters, and more using APA PsycNet®, the only search platform designed specifically to deliver APA content.

MORE ABOUT APA PSYCNET

APA Publishing Blog

The blog is your source for training materials and information about APA’s databases and electronic resources. Stay up-to-date on training sessions, new features and content, and resources to support your subscription.

Follow blog




Advances in database systems education: Methods, tools, curricula, and way forward

Muhammad Ishaq, Muhammad Shoaib Farooq, Muhammad Faraz Manzoor, Uzma Farooq, Kamran Abid, Mamoun Abu Helou


Accepted 2022 Aug 16; Issue date 2023.

This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.

Fundamentals of Database Systems is a core course in computing disciplines, as almost all small, medium, large, or enterprise systems essentially require a data storage component. Database System Education (DSE) provides the foundational as well as advanced concepts in the area of data modeling and its implementation. The first course in DSE plays a pivotal role in developing students' interest in this area. Over the years, researchers have devised several different tools and methods to teach this course effectively, and have also been revisiting the curricula for database systems education. This study presents a Systematic Literature Review (SLR) that distills the existing literature pertaining to DSE in order to discuss these three perspectives for the first course in database systems. The SLR also discusses how the teaching and learning assistant tools, teaching and assessment methods, and database curricula have evolved over the years in response to rapid change in database technology. To this end, more than 65 articles related to DSE published between 1995 and 2022 were shortlisted through a structured mechanism and reviewed to address the aforementioned objectives. The article also provides useful guidelines for instructors and discusses ideas to extend this research from several perspectives. To the best of our knowledge, this is the first research work that presents a broad review of the research conducted in the area of DSE.

Keywords: Higher education, Database, Education, Database curriculum, Tools, Teaching methods

Introduction

Database systems play a pivotal role in the successful implementation of the information systems to ensure the smooth running of many different organizations and companies (Etemad & Küpçü, 2018 ; Morien, 2006 ). Therefore, at least one course about the fundamentals of database systems is taught in every computing and information systems degree (Nagataki et al., 2013 ). Database System Education (DSE) is concerned with different aspects of data management while developing software (Park et al., 2017 ). The IEEE/ACM computing curricula guidelines endorse 30–50 dedicated hours for teaching fundamentals of design and implementation of database systems so as to build a very strong theoretical and practical understanding of the DSE topics (Cvetanovic et al., 2010 ).

In practice, most universities offer one user-oriented course at the undergraduate level that covers topics related to data modeling and design, querying, and a limited number of hours on theory (Conklin & Heinrichs, 2005; Robbert & Ricardo, 2003), and it is often debatable whether to utilize a design-first or query-first approach. Furthermore, in order to update the course contents, some recent trends, including big data and the notion of NoSQL, should also be introduced in this basic course (Dietrich et al., 2008; Garcia-Molina, 2008). The graduate course, in contrast, is more theoretical and includes topics related to DB architecture, transactions, concurrency, reliability, distribution, parallelism, replication, and query optimization, along with some specialized classes.

Researchers have designed a variety of tools for making different concepts of the introductory database course more interesting and easier to teach and learn interactively (Brusilovsky et al., 2010), either using visual support (Nagataki et al., 2013) or with the help of gamification (Fisher & Khine, 2006). Similarly, instructors have been improvising different methods to teach (Abid et al., 2015; Domínguez & Jaime, 2010) and evaluate (Kawash et al., 2020) this theoretical and practical course. Emerging topics such as cloud computing and big data have also created the need to revise the curriculum and the methods used to teach DSE (Manzoor et al., 2020).

The research in database systems education has evolved over the years with respect to modern contents influenced by technological advancements, supportive tools that engage learners for better learning, and improvisations in teaching and assessment methods. In particular, recent years have seen a shift from self-describing data-driven systems to a problem-driven paradigm, that is, a bottom-up approach in which data exists before it is designed. This paradigm relies mainly on scientific, quantitative, and empirical methods for building models, while pushing the boundaries of typical data management by involving mathematics, statistics, data mining, and machine learning, thus opening a multidisciplinary perspective. Hence, it is important to devote a few lectures to introducing the relevance of such advanced topics.

Researchers have provided useful review articles on other areas, including introductory programming languages (Mehmood et al., 2020), the use of gamification (Obaid et al., 2020), research trends in the use of the enterprise service bus (Aziz et al., 2020), and the role of IoT in agriculture (Farooq et al., 2019, 2020). However, to the best of our knowledge, no such study was found in the area of database systems education. Therefore, this study discusses research work published in different areas of database systems education involving curricula, tools, and approaches that have been proposed to teach an introductory course on database systems effectively. The rest of the article is structured in the following manner: Sect. 2 presents related work and provides a comparison of the related surveys with this study. Section 3 presents the research methodology for this study. Section 4 analyses the major findings of the reviewed literature and categorizes it into different important aspects. Section 5 presents advice for instructors and future directions. Lastly, Sect. 6 concludes the article.

Related work

Systematic literature reviews have been found to be a very useful artifact for covering and understanding a domain, and a number of interesting review studies exist in different fields (Farooq et al., 2021; Ishaq et al., 2021). Review articles are generally categorized into narrative or traditional reviews (Abid et al., 2016; Ramzan et al., 2019), systematic literature reviews (Naeem et al., 2020), and meta reviews or mapping studies (Aria & Cuccurullo, 2017; Cobo et al., 2012; Tehseen et al., 2020). This study presents a systematic literature review on database systems education.

Database systems education has been discussed from many different perspectives, including teaching and learning methods, curriculum development, and the facilitation of instructors and students through the development of different tools. For instance, a number of research articles have focused on developing tools for teaching the database systems course (Abut & Ozturk, 1997; Connolly et al., 2005; Pahl et al., 2004). Furthermore, a few authors have evaluated DSE tools by conducting surveys and performing empirical experiments so as to gauge the effectiveness of these tools and their degree of acceptance among the important stakeholders, teachers and students (Brusilovsky et al., 2010; Nelson & Fatimazahra, 2010). Some case studies have also been reported to evaluate the effectiveness of improvised approaches and developed tools. For example, Regueras et al. (2007) presented a case study using the QUEST system, in which e-learning strategies are used to teach the database course at the undergraduate level, while Myers and Skinner (1997) identified the conflicts that arise when textbook theories regarding the development of databases do not work for specific applications.

Another important facet of DSE research focuses on curriculum design and evolution for database systems, whereby Alrumaih (2016), Bhogal et al. (2012), Cvetanovic et al. (2010), and Sahami et al. (2011) have proposed improvements to the database curriculum for a better understanding of DSE among students, while also keeping the evolving technology in perspective. Similarly, Mingyu et al. (2017) have shared their experience of reforming the DSE curriculum by adding topics related to big data. A few authors have also developed and evaluated different tools to help instructors teach DSE.

Further studies focus on other aspects, including specialized tools for specific topics in DSE (Mcintyre et al., 1995; Nelson & Fatimazahra, 2010). For instance, Mcintyre et al. (1995) conducted a survey on using state-of-the-art software tools to teach advanced relational database design courses at Cleveland State University; however, they did not discuss DSE curricula or pedagogy. Similarly, a review by Nelson and Fatimazahra (2010) highlighted that a basic knowledge of databases is important for students of computer science as well as those from other domains. They highlighted the issues encountered while teaching the database course in universities and suggested that instructors investigate these difficulties so as to make the course more effective for students. Although these authors discussed and analyzed tools for teaching databases, the tools were not categorized according to the different methods and research types within DSE. There is also an interesting systematic mapping study by Taipalus and Seppänen (2020) that focuses on teaching SQL, a specific topic within DSE. They categorized the selected primary studies into six categories based on their research types, using directed content analysis: student errors in query formulation, characteristics and presentation of the exercise database, specific or non-specific teaching approach suggestions, patterns and visualization, and easing teacher workload.

Another relevant study, which focuses on collaborative learning techniques for teaching the database course, was conducted by Martin et al. (2013). This research discusses collaborative learning techniques adapted for the introductory database course at the Barcelona School of Informatics. The authors' motive was to introduce active learning methods to improve learning and encourage the acquisition of competences. However, the focus of the study was only on a few methods for teaching the database systems course, while other important perspectives, including database curricula and tools for teaching DSE, were not discussed.

The above discussion shows that a considerable amount of research has been conducted in the field of DSE to propose various teaching methods; to develop and test different supportive tools, techniques, and strategies; and to improve the curricula for DSE. However, to the best of our knowledge, there is no study that puts all these relevant aspects together while also classifying and discussing the supporting methods and techniques. This review is considerably different from previous studies. Table 1 highlights the differences between this study and other relevant studies in the field of DSE using the ✓ and – symbols to indicate "included" and "not included", respectively. Therefore, this study conducts a systematic mapping study on DSE that focuses on compiling, classifying, and discussing the existing work related to pedagogy, supporting tools, and curricula.

Comparison with other related research articles

The compared studies are Mcintyre et al., Myers & Skinner, Beecham et al., Dietrich et al., Regueras et al., Nelson & Fatimazahra, Martin et al., Abbasi et al., Luxton-Reilly et al., Taipalus & Seppänen, and this article, published between 1995 and 2022. Their focus ranges over databases, software engineering, OOP, and programming, whereas this article focuses on database systems. The studies are compared on five dimensions: research type classification, teaching methods, tools to aid teaching, curricula considered, and evolution.

Research methodology

In order to serve the principal aim of this study, which is to review the research conducted in the area of database systems education, guidance was drawn from the methods described in various existing studies (Elberzhager et al., 2012; Keele et al., 2007; Mushtaq et al., 2017) to search for the relevant papers. Research objectives were formulated first, and based on them appropriate research questions and a search strategy were defined, as shown in Fig. 1.

Fig. 1

Research objectives

The following are the research objectives of this study:

To find high quality research work in DSE.

To categorize different aspects of DSE covered by other researchers in the field.

To provide a thorough discussion of the existing work, offering useful information in the form of evolution, teaching guidelines, and future research directions for instructors.

Research questions

In order to fulfill the research objectives, some relevant research questions have been formulated. These questions along with their motivations have been presented in Table 2 .

Research questions and their motivations

RQ1: What are the developments in DSE with respect to tools, methods, and curriculum?
Motivation: identify the focal areas of research in DSE and discuss the work done in each area.

RQ2: How has the research in DSE evolved over the past 25 years?
Motivation: discuss the focus of research in different time spans while mapping it onto technological advancement.

Search strategy

The following search string was used to find relevant articles for this study: “Database” AND (“System” OR “Management”) AND (“Education*” OR “Train*” OR “Tech*” OR “Learn*” OR “Guide*” OR “Curricul*”).

Articles were drawn from different sources, i.e., IEEE, Springer, ACM, and Science Direct, as well as other well-known journals and conferences available through the Wiley Online Library, PLOS, and ArXiv. Careful planning of the search for primary studies in the field of DSE is a vital task.
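As an illustration only, the following minimal Python sketch shows how such a boolean search string with wildcard terms can be applied as a screening filter over a list of candidate titles; the prefix lists and example titles are assumptions for demonstration, not artifacts of the original study.

import re

# Boolean search string from the study:
# "Database" AND ("System" OR "Management") AND
# ("Education*" OR "Train*" OR "Tech*" OR "Learn*" OR "Guide*" OR "Curricul*")
GROUPS = [
    ["database"],
    ["system", "management"],
    ["education", "train", "tech", "learn", "guide", "curricul"],  # wildcard prefixes
]

def matches_search_string(title: str) -> bool:
    """Return True if the title satisfies every AND group (any prefix in a group may match)."""
    words = re.findall(r"[a-z]+", title.lower())
    return all(
        any(word.startswith(prefix) for word in words for prefix in group)
        for group in GROUPS
    )

if __name__ == "__main__":
    candidates = [
        "An interactive learning tool for database systems courses",  # matches all three groups
        "Reforming the database management curriculum",               # matches
        "Query optimization in distributed database systems",         # fails: no education-related term
    ]
    for title in candidates:
        print(matches_search_string(title), "-", title)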

Study selection

A total of 29,370 initial studies were found. These articles went through a selection process in which two authors shortlisted the articles based on the defined inclusion criteria, as shown in Fig. 2. Their conflicts were resolved by involving a third author, and the inclusion/exclusion criteria were refined after resolving the conflicts, as shown in Table 3. A Cohen's kappa coefficient of 0.89 was observed between the two authors who selected the articles, which reflects almost perfect agreement between them (Landis & Koch, 1977). The number of papers at each stage of the selection process, for all the portals involved, is presented in Table 4.

Fig. 2

Selection criteria

IC: Inclusion criteria
IC1: The study is related to databases and education
IC2: The research was published between 1995 and 2022
IC3: Only full-length papers are included
IC4: Only research papers written in English are included

EC: Exclusion criteria
EC1: Incomplete papers, i.e., presentations, posters, or essays
EC2: Research articles without an abstract
EC3: Research articles in languages other than English
EC4: Papers that do not have education as their primary focus

Study selection results (number of papers per stage)
Phase 1, search (search string): IEEE 500, Springer 5,312, ACM 10,802, Elsevier 5,696, others 7,045, total 29,370
Phase 2, screening (title): IEEE 153, Springer 121, ACM 115, Elsevier 133, others 87, total 609
Phase 3, screening (abstract): IEEE 45, Springer 23, ACM 29, Elsevier 21, others 40, total 158
Phase 4, screening (full text): IEEE 10, Springer 1, ACM 20, Elsevier 2, others 37, total 70

Title-based screening: Papers that were irrelevant based on their title were manually excluded in the first stage. A large portion of the papers were irrelevant at this stage, and only 609 papers remained.

Abstract-based screening: At this stage, the abstracts of the papers selected in the previous stage were studied, and the papers were categorized for analysis along with their research approach. After this stage only 152 papers were left.

Full-text-based analysis: The empirical quality of the articles selected in the previous stage was evaluated, and the full text of each article was analyzed. A total of 70 papers were extracted from the 152 papers as the primary studies. The following questions were defined for the final data extraction.

Quality assessment criteria

The following criteria were used to assess the quality of the selected primary studies; the quality assessment was conducted by the two authors, as explained above.

(a) The study focuses on curricula, tools, approaches, or assessments in DSE; the possible answers were Yes (1) or No (0).

(b) The study presents a solution to a problem in DSE; the possible answers were Yes (1), Partially (0.5), or No (0).

(c) The study presents empirical results; the possible answers were Yes (1) or No (0).

(d) The study is published in a well-reputed venue, as adjudged through the CORE ranking of conferences and the Scientific Journal Ranking (SJR); the possible scores are given in Table 5.

Score pattern of publication channels

Journal (SJR quartile): Q1 = 2, Q2 = 1.5, Q3 = 1, Q4 = 0.5, other = 0
Conference/Workshop/Symposium (CORE ranking): Core A = 1.5, Core B = 1, Core C = 0.5, other = 0
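As a minimal sketch of how the four criteria and the venue scores above combine into the totals reported in Table 6 (the function and dictionary names below are illustrative, not taken from the study):

VENUE_SCORES = {
    "Q1": 2.0, "Q2": 1.5, "Q3": 1.0, "Q4": 0.5,   # journal SJR quartiles
    "Core A": 1.5, "Core B": 1.0, "Core C": 0.5,  # conference/workshop/symposium CORE ranks
    "Other": 0.0,
}
SOLUTION_SCORES = {"Yes": 1.0, "Partially": 0.5, "No": 0.0}

def quality_score(dse_focus: bool, solution: str, empirical: bool, venue: str) -> float:
    """Sum the four criteria: (a) DSE focus, (b) proposed solution,
    (c) empirical results, (d) publication venue score from Table 5."""
    return ((1.0 if dse_focus else 0.0)
            + SOLUTION_SCORES[solution]
            + (1.0 if empirical else 0.0)
            + VENUE_SCORES[venue])

# A Q1 journal paper with a DSE focus, a full solution, and empirical results
print(quality_score(True, "Yes", True, "Q1"))  # 5.0, the maximum total seen in Table 6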

Almost 50% of the papers scored above average, and 33.33% scored within the average range, i.e., 2.50–3.50. Some articles with a score below 2.50 have also been included in this study because they present useful information and were published in education-focused journals; these studies also discuss important demography- and technology-based aspects that are directly related to DSE.

Threats to validity

The validity of this study could be influenced by the following factors.

Construct validity

Construct validity concerns the identification of the primary studies for the research (Elberzhager et al., 2012). To ensure that as many relevant primary studies as possible were included, two authors proposed candidate search keywords over multiple iterations. The search string comprises different terms related to database systems and education. The list of terms might be incomplete, and alternative terms could change the number of papers found (Ampatzoglou et al., 2013). The IEEE digital library, Science Direct, the ACM digital library, the Wiley Online Library, PLOS, ArXiv, and Google Scholar were the main libraries searched; according to the statistics of literature search engines, most relevant research can be found in these digital libraries (Garousi et al., 2013). We also searched for related papers at the main database research venues (VLDB, ICDM, EDBT) in order to minimize the risk of missing important publications.

Including papers that do not belong to top journals or conferences may reduce the average quality score of the primary studies in this research, but it improves their representativeness. Certain papers from outside the top publication sources were therefore included because of their relevance to the literature, even though they reduce the average score. We also reduced the possibility of distorted results caused by improper handling of duplicate papers: some cases of duplication were found and later inspected to determine whether they were the same study. The two authors who conducted the search took the final decision on selecting the papers; where they did not agree, the papers were discussed until an agreement was reached.

Internal validity

This validity deals with data extraction and analysis (Elberzhager et al., 2012). Two authors carried out the data extraction and the classification of primary studies, and conflicts between them were resolved by involving a third author. The kappa coefficient was 0.89 which, according to Landis and Koch (1977), indicates an almost perfect level of agreement between the authors and reduces this threat significantly.
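For reference, Cohen's kappa compares the observed agreement with the agreement expected by chance from the raters' marginal label frequencies. The following minimal sketch uses made-up include/exclude decisions rather than the study's actual screening records.

from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the chance agreement implied by each rater's label frequencies."""
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in set(rater_a) | set(rater_b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Ten hypothetical screening decisions (1 = include, 0 = exclude) by two raters
a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
print(round(cohens_kappa(a, b), 2))  # 0.8 for this toy data; the study's 0.89 falls
                                     # in the "almost perfect" band (0.81-1.00)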

Conclusion validity

This threat deals with the identification of improper results, which may lead to improper conclusions; in this case it concerns factors such as missing studies and wrong data extraction (Ampatzoglou et al., 2013). The objective is to limit these factors so that other authors can repeat the study and reach the proper conclusions (Elberzhager et al., 2012).

The interpretation of results might be affected by the selection and classification of the primary studies and by the analysis of the selected studies. The previous section clearly described each step performed in primary study selection and data extraction in order to minimize this threat. Traceability between the results and the extracted data is supported through the different charts. In our view, slight differences in publication selection or occasional misclassification would not alter the main results.

External validity

This threat deals with the generalizability of this research (Mateo et al., 2012). Only results related to the DSE field were considered, and the conclusions extracted from this study are valid only in the DSE context. The representativeness of the selected studies was not affected because no time restriction was placed on finding the published research; therefore, this external validity threat is not relevant in the context of this research. Database researchers can take the search string and the paper classification scheme presented in this study as a starting point, and more papers can be searched and categorized according to this scheme.

Analysis of compiled research articles

This section presents the analysis of the compiled research articles carefully selected for this study. It presents the findings with respect to the research questions described in Table 2 .

Selection results

A total of 70 papers were identified and analyzed to answer the RQs described above. Table 6 presents the selected papers along with their classification results and quality assessment scores.

Classification and quality assessment of selected articles

Ref Channel Year Research Type a b c d Total
Tools Quality Assessment
(Mcintyre et al., ) Journal 1995 Review 1 1 0 2 4
(Abut & Ozturk, ) Conference 1997 Experiment 1 1 0 0 2
(Yau & Karim, ) Conference 2003 Experiment 1 0.5 0 1 2.5
(Pahl et al., ) Journal 2004 Experiment 1 1 0 0 2
(Connolly et al., ) Conference 2005 Experiment 1 0.5 1 1 3.5
(Regueras et al., ) Conference 2007 Case Study 1 1 1 0 3
(Sciore, ) Symposium 2007 Case Study 1 0 1 1.5 3.5
(Holliday & Wang, ) Conference 2009 Experiment 1 0.5 1 0.5 3
(Brusilovsky et al., ) Journal 2010 Experiment 1 1 1 2 5
(Cvetanovic et al., ) Journal 2010 Experiment 1 1 0 2 4
(Nelson & Fatimazahra, ) Journal 2010 Review 1 1 0 1 3
(Wang et al., ) Conference 2010 Experiment 1 1 0 1.5 3.5
(Nagataki et al., ) Journal 2013 Experiment 0 1 1 2 4
(Yue, ) Journal 2013 Experiment 1 1 1 1.5 4.5
(Abelló Gamazo et al., ) Journal 2016 Experiment 1 1 1 2 5
(Taipalus & Perälä, ) Symposium 2019 Review 1 1 1 1.5 4.5
Methods Quality Assessment
(Dietrich & Urban, ) Conference 1996 Review 1 1 0 1.5 3.5
(Urban & Dietrich, ) Journal 1997 Experiment 1 1 0 0 2
(Nelson et al., ) Workshop 2003 Review 1 1 0 0 2
(Amadio, ) Conference 2003 Experiment 1 0.5 1 0.5 3
(Connolly & Begg, ) Journal 2006 Experiment 1 1 0 2 4
(Morien, ) Journal 2006 Experiment 1 0.5 1 2 4.5
(Prince & Felder, ) Journal 2006 Review 0 0.5 0 2 2.5
(Martinez-González & Duffing, ) Journal 2007 Review 1 1 0 2 4
(Gudivada et al., ) Conference 2007 Review 1 0.5 0 0 1.5
(Svahnberg et al., ) Symposium 2008 Review 1 0 0 1.5 2.5
(Brusilovsky et al., ) Conference 2008 Experiment 1 0.5 1 1.5 4
(Dominguez & Jaime, ) Journal 2010 Experiment 1 1 1 2 5
(Efendiouglu & Yelken ) Journal 2010 Experiment 1 1 1 0 3
(Hou & Chen, ) Conference 2010 Review 1 0.5 1 0 2.5
(Yuelan et al., ) Conference 2011 Experiment 1 0.5 0 0 1.5
(Zheng & Dong, ) Conference 2011 Review 1 1 0 1 3
(Al-Shuaily, ) Workshop 2012 Review 1 1 1 0 3
(Juxiang & Zhihong, ) Conference 2012 Review 1 0.5 0 0 1.5
(Chen et al., ) Journal 2012 Review 1 1 1 2 5
(Martin et al., ) Journal 2013 Review 1 1 1 2 5
(Rashid & Al-Radhy, ) Conference 2014 Review 1 0.5 1 0 2.5
(Wang & Chen, ) Conference 2014 Experiment 1 0 1 0 2
(Dicheva et al., ) Journal 2015 Review 1 1 0 1 3
(Rashid, ) Journal 2015 Review 1 0.5 1 2 4.5
(Etemad & Küpçü, ) Journal 2018 Experiment 0 0.5 1 2 3.5
(Kui et al., ) Conference 2018 Experiment 1 1 0 1 3
(Taipalus et al., ) Journal 2018 Review 1 1 0 2 4
(Zhang et al., ) Conference 2018 Experiment 1 1 1 0 3
(Shebaro, ) Journal 2018 Review 1 0.5 1 0 2.5
(Cai & Gao, ) Conference 2019 Review 1 1 0 0 2
(Kawash et al., ) Symposium 2020 Experiment 1 1 1 1.5 4.5
(Taipalus & Seppänen, ) Journal 2020 Review 1 1 1 2 5
(Canedo et al., ) Journal 2021 Experiment 1 1 1 1 4
(Naik & Gajjar, ) Journal 2021 Case Study 1 1 1 0 3
(Ko et al., ) Journal 2021 Review 1 1 1 2 5
(Sibia et al., ) Workshop 2022 Case Study 1 1 1 0 3
Curriculum Quality Assessment
(Dean & Milani, ) Conference 1995 Experiment 1 0.5 1 0.5 3
(Urban & Dietrich, ) Symposium 2001 Case Study 1 0 1 1.5 3.5
(Calero et al., ) Journal 2003 Review 1 1 0 2 4
(Robbert & Ricardo, ) Conference 2003 Review 1 1 0 1.5 3.5
(Adams et al., ) Journal 2004 Experiment 1 1 0 0 2
(Conklin & Heinrichs, ) Journal 2005 Review 1 1 1 0 3
(Dietrich et al., ) Journal 2008 Case Study 0 1 1 2 4
(Luo et al., ) Conference 2008 Experiment 1 1 1 0 3
(Marshall, ) Conference 2011 Review 1 1 1 0 3
(Bhogal et al., ) Workshop 2012 Case Study 1 1 0 0 2
(Picciano, ) Journal 2012 Review 1 1 0 0 2
(Abid et al., ) Journal 2015 Review 1 1 1 1 4
(Taipalus & Seppänen, ) Journal 2015 Experiment 1 1 1 2 5
(Abourezq & Idrissi, ) Journal 2016 Experiment 1 1 0 0.5 2.5
(Silva et al., ) Conference 2016 Experiment 1 1 0 1.5 3.5
(Zhanquan et al., ) Journal 2016 Review 1 1 1 0 3
(Mingyu et al., ) Conference 2017 Experiment 1 1 1 0 3
(Andersson et al., ) Conference 2019 Review 1 0.5 0 0 1.5

RQ1. Categorization of research work in the DSE field

The analysis in this study reveals that the literature can be categorized as follows, as shown in Fig. 3:

  • Tools: any additional application that helps instructors in teaching and students in learning.
  • Methods: any improvisation aimed at improving pedagogy or cognition.
  • Curriculum: the course content domains and their relative importance in a degree program.

Fig. 3

Taxonomy of DSE study types

Most of the articles provide a solution by gathering data and demonstrate the novelty of their research through results; these papers are categorized as experiments with respect to their research type. Some are case study papers, which are used to generate an in-depth, multifaceted understanding of a complex issue in its real-life context, while a few others are review studies analyzing previously used approaches. Overall, a majority of the included articles evaluated their results with the help of experiments, while the others conducted reviews to establish an opinion, as shown in Fig. 4.

Fig. 4

Cross Mapping of DSE study type and research Types

Educational tools, especially those related to technology, are entering the market faster than ever before (Calderon et al., 2011). The transition to active learning approaches, with the learner more engaged in the process rather than passively taking in information, necessitates a variety of tools to help ensure success. As with most educational initiatives, time should be taken to consider the goals of the activity, the type of learners, and the tools needed to meet the goals. Constant reassessment of tools is important to discover innovations and reforms that improve teaching and learning (Irby & Wilkerson, 2003). For this purpose, various types of educational tools, such as interactive, web-based, and game-based tools, have been introduced to aid instructors in explaining topics more effectively.

The inclusion of technology in the classroom may help learners compete in a competitive market as they approach the start of their careers. It is important for instructors to acknowledge that students are more interested in using technology to learn the database course than in merely being taught through traditional theory, project, and practice-based methods (Adams et al., 2004). Keeping these aspects in view, many authors have conducted significant research on web-based and interactive tools to help learners gain a better understanding of basic database concepts.

A considerable amount of research has focused on student learning. In this study we discuss tools that support student learning in terms of two major findings: tools that proved more helpful than other tools, and tools that produced the same outcome as a traditional classroom environment. For example, Abut and Ozturk (1997) proposed an interactive classroom environment for conducting database classes; online tools such as an electronic "whiteboard", electronic textbooks, advanced telecommunication networks, and a few other resources such as Matlab and the World Wide Web were the main highlights of their proposed smart classroom. Pahl et al. (2004) presented an interactive multimedia-based system for the knowledge- and skill-oriented web-based education of database course students; they differentiated their proposed environment from the traditional classroom-based approach through tool-mediated independent learning and training in an authentic setting. Other authors have evaluated educational tools based on their usage and impact on students' learning. For example, Brusilovsky et al. (2010) evaluated the technical and conceptual difficulties of using several interactive educational tools in the context of a single course, presenting a combined Exploratorium for database courses and an experimental platform that delivers adapted access to numerous types of interactive learning activities.

Taipalus and Perälä (2019) investigated the types of errors that are persistent in students' SQL writing and mapped these errors onto different query concepts. Abelló Gamazo et al. (2016) presented a software tool for the e-assessment of relational database skills named LearnSQL, which allows the automatic and efficient e-learning and e-assessment of relational database skills. Yue (2013) proposed the Sakila database as a unified platform to support instruction and multiple assignments of a graduate database course over five semesters; according to this study, students found this tool more useful and interesting than the highly simplified databases developed by the instructor or obtained from a textbook. Other authors have proposed tools whose main objective is to strengthen students' grasp of a topic by addressing the pedagogical problems in using educational tools. Connolly et al. (2005) discussed some of the pedagogical problems underlying the development of a constructivist learning environment that uses problem-based learning, a simulation game, and interactive visualizations to help teach database analysis and design. Yau and Karim (2003) proposed a smart classroom based on pervasive computing technology to facilitate collaborative learning among learners, with the major aim of improving the quality of interaction between instructors and students during lectures.

Student satisfaction is also an important factor for educational tools to be effective. While a tool supports the students' learning process, it should also be flexible enough to gain students' confidence by adapting to their needs (Brusilovsky et al., 2010; Connolly et al., 2005; Pahl et al., 2004). Cvetanovic et al. (2010) proposed a web-based educational system named ADVICE that helps students reduce the gap between DBMS theory and practice. Other authors have enhanced already existing educational tools in the traditional classroom environment to address students' concerns (Nelson & Fatimazahra, 2010; Regueras et al., 2007), as summarized in Table 7.

Tools adopted in DSE and their impacts

Objective: support of students' learning
Finding: more supportive
  • (Abut & Ozturk): Data models and data modelling principles; IDLE (the Interactive Database Learning Environment)
  • (Pahl et al.): Data models; IDLE
  • (Brusilovsky et al.): SQL; SQL-Knot, SQL-Lab
  • Conceptual database design, Logical database design, Physical database design; Online games
  • SQL; Interactive
  • (Abbasi et al.): Relational Database; LearnSQL
  • (Yue): Relational Calculus, XML generation, XPath, and XQuery; Sakila
  • (Nelson & Fatimazahra): Introductory Database topics; TLAD
Finding: same as other tools
  • (Connolly et al.): Conceptual database design, Logical database design, Physical database design; Online games
  • (Yau & Karim): Introductory Database topics; RCSM

Objective: students' satisfaction
Finding: satisfied
  • (Brusilovsky et al.): SQL; SQL-Knot, SQL-Lab
  • (Cvetanovic et al.): SQL, formal query languages, and normalization; ADVICE
  • (Connolly et al.)
  • (Pahl et al.): Data models; IDLE
Finding: similar satisfaction to the traditional classroom environment
  • (Nelson & Fatimazahra): Introductory Database topics; TLAD
  • (Regueras et al.): Entity Relationship Model; QUEST

Objective: students' motivation towards database development
Finding: same impact as other approaches
  • (Nagataki et al.): SQL; sAccess
Finding: helped students develop better database development strategies
  • (Brusilovsky et al.): SQL; SQL-Knot, SQL-Lab
  • (Mcintyre et al.): Relational Database Design; Expert IT system

Objective: students' course performance
Finding: better performance
  • (Cvetanovic et al.): SQL, formal query languages, and normalization; ADVICE
  • (Wang et al.): Entity Relationship Model, SQL; MeTube
  • (Holliday & Wang): MySQL; MeTube
  • (Taipalus & Perälä): SQL; Interactive
Finding: same performance as other approaches
  • (Pahl et al.): Data models; IDLE
  • (Yue): Relational Calculus, XML generation, XPath, and XQuery; Sakila

Objective: student and instructor interaction
Finding: increased
  • (Abut & Ozturk): Introductory Database topics; "Whiteboard"
  • (Yau & Karim): Introductory Database topics; RCSM
  • (Taipalus & Perälä): SQL; Interactive

Hands-on database development is a major concern in most institutes as well as in industry; however, tools that assist students in database development and query writing, especially in SQL, remain a major concern (Brusilovsky et al., 2010; Nagataki et al., 2013).

Students' grades reflect their conceptual clarity and database development skills. Grades are also important for securing jobs and scholarships after graduation, which is why it is important to have educational learning tools that help students perform well in exams (Cvetanovic et al., 2010; Taipalus et al., 2018). A few authors (Wang et al., 2010) proposed MeTube, a variation of YouTube, for this purpose. Consequently, existing educational tools need to be upgraded or replaced by more suitable, assessment-oriented interactive tools to meet challenging student needs (Pahl et al., 2004; Yuelan et al., 2011).

Another objective of developing educational tools is to increase the interaction between students and instructors. In the modern era, almost every institute follows student-centered learning (SCL), in which the interaction between students and instructor increases and most of the interaction is initiated by the students. In order to support SCL, educational interactive and web-based tools need to assign more roles to the students than to the instructors (Abbasi et al., 2016; Taipalus & Perälä, 2019; Yau & Karim, 2003).

Theory versus practice is still one of the main issues in DSE teaching methods. The traditional teaching method presents theory first, and the concepts learned in the theoretical lectures are then implemented in the lab. Others think it is better to start by teaching how to write queries, followed by the design principles for databases, with a limited number of credit hours allocated to general database theory topics. This part of the article discusses the different trends in teaching and learning styles, along with the curriculum and assessment methods discussed in the DSE literature.

A variety of teaching methods have been designed, tested, and evaluated by different researchers (Yuelan et al., 2011; Chen et al., 2012; Connolly & Begg, 2006). Some authors have reformed teaching methods based on the requirements of modern lecture delivery. For example, Yuelan et al. (2011) reformed their teaching method using several approaches: (a) modern means of education, including multimedia sound, animation, and simulation of the processes and workings of database systems to motivate and inspire students; (b) a project-driven approach, which aims to make students familiar with system operations by implementing a project; (c) strengthening the experimental aspects, to help students get a strong grip on basic database knowledge and develop a self-learning ability; and (d) improving the traditional assessment method, so that students submit their research and development work as the content of the exam and solve problems on their own.

The main aim of any teaching method is to make students learn the subject effectively. Students must show interest in order to gain something from the lectures delivered by the instructors, so teaching methods should be interactive and interesting enough to develop students' interest in the subject; students can show this interest by asking more relevant questions or by completing home tasks and assignments on time. Authors have proposed several teaching methods to make topics more interesting. Chen et al. (2012) proposed a scaffolded concept mapping strategy, which considers a student's prior knowledge and provides flexible learning aids (scaffolding and fading) for reading and drawing concept maps. Connolly and Begg (2006) examined different problems in teaching database analysis and design, and proposed a teaching approach driven by principles found in constructivist epistemology to overcome these problems; this constructivist approach is based on the cognitive apprenticeship model and project-based learning. Similarly, Domínguez and Jaime (2010) proposed an active method for database design through the development of practical tasks in a face-to-face course; they analyzed the results of five academic years using a quasi-experimental design, where in the first three years a traditional strategy was followed and a course management system was used as a material repository. Dietrich and Urban (1996) described the use of cooperative group learning concepts in support of an undergraduate database management course, designing the project deliverables in such a way that students develop skills for database implementation. Zhang et al. (2018) discussed several effective classroom teaching measures covering the innovation of teaching content, teaching methods, teaching evaluation, and assessment methods, and practiced these measures in the database technologies and applications course at Qinghai University. Moreover, Hou and Chen (2010) proposed a new teaching method based on blended learning theory, which merges traditional and constructivist methods, and applied it to the teaching of an Access database programming course.

Problem-solving skill is a key aspect of any type of learning at any age, and students must possess this skill to tackle hurdles both at their institute and in industry. Creative and innovative students find various unique ways to solve daily tasks, which is why they are more likely to secure good grades and jobs. Authors have therefore been working on teaching methods that develop problem-solving skills in students (Al-Shuaily, 2012; Cai & Gao, 2019; Martinez-González & Duffing, 2007; Gudivada et al., 2007). For instance, Al-Shuaily (2012) explored four cognitive factors that might influence SQL teaching and learning methods and approaches: (i) novices' ability to understand, (ii) novices' ability to translate, (iii) novices' ability to write, and (iv) novices' skills. Cai and Gao (2019) reformed the teaching method in the database course of two higher education institutes in China, with skills and knowledge, innovation ability, and data abstraction as the main objectives of their study. Martinez-González and Duffing (2007) analyzed the impact of European Union (EU) convergence on different universities across Europe; according to their study, these institutes need to restructure their degree programs and teaching methodologies. Moreover, Gudivada et al. (2007) proposed a method for students to learn to work with large datasets, using the Amazon Web Services API and a .NET/C# application to extract a subset of a product database to enhance student learning in a relational database course.

On the other hand, authors have also evaluated traditional teaching methods with respect to enhancing students' problem-solving skills (Eaglestone & Nunes, 2004; Wang & Chen, 2014; Efendiouglu & Yelken, 2010). Eaglestone and Nunes (2004) shared their experiences of delivering a database design course at Sheffield University and discussed some of the issues they faced regarding teaching, learning, and assessment. Wang and Chen (2014) summarized the problems in teaching traditional database theory and applications; according to the authors, the teaching method is outdated and does not focus on the important combination of theory and practice. Moreover, Efendiouglu and Yelken (2010) investigated the effects of two different methods, Programmed Instruction (PI) and Meaningful Learning (ML), on primary school teacher candidates' academic achievements and attitudes toward computer-based education, and defined their views on these methods; the results show that PI is not favoured for teaching applications because of its behavioural structure (Table 8).

Methods: Teaching approaches adopted in DSE

Objective: develop interest in the subject
Finding: students begin to ask more relevant questions
  • (Chen et al.): Data modeling, relational databases, database query languages; Scaffolded Concept
  • (Connolly & Begg): Database concepts, Database Analysis and Design, Implementation; Constructivist-Based Approach
  • (Dominguez & Jaime): Database design; Project-based learning
  • (Rashid & Al-Radhy): Database Analysis and Design; Project based learning, Assessment based learning
  • (Yuelan et al.): Principles of Database, SQL Server; Project-driven approach
  • (Taipalus & Seppänen): SQL; Group learning and projects
  • (Brusilovsky et al.): SQL; SQL Exploratorium
  • (Hou & Chen): Access; Blending Learning
Finding: same effect as traditional teaching methods
  • (Dietrich & Urban): ER Model, Relational Design, SQL; Teaching and learning strategies
  • (Kui et al.): E-R model, relational model, SQL; Flipped Classroom
  • (Rashid): Entity Relational Database, Relational Algebra, Normalization; Learning and Assessment Methods
  • (Zhang et al.): Data Models, Physical Data Design; Project teaching mode, Discussion teaching mode, Demonstrative teaching mode

Objective: develop problem-solving skills
Finding: students become creative and try new methods to solve tasks
  • (Al-Shuaily): SQL; Cognitive task, Comprehension Task
  • (Cai & Gao): E-R model, relational model, SQL; Database Course for Liberal Arts Majors
  • (Martin et al.): SQL and relational algebra, The relational model, Transaction management; Collaborative Learning
  • (Martinez-González & Duffing): Data Models, Physical Data Design, SQL; European convergence in higher education
  • (Prince & Felder): SQL; Inductive teaching and learning
  • (Urban & Dietrich): Relational database mapping and prototyping, Database system implementation; Cooperative group project based learning
  • (Gudivada et al.): SQL, Logical design, Physical Design; Working with large datasets from Amazon
Finding: students use the same methods as mentioned in books
  • (Eaglestone & Nunes): SQL, ER Model; Pedagogical model, teaching and learning strategies
  • (Wang et al.): SQL Server and Oracle; Refine Teaching Method
  • (Efendiouglu & Yelken): SQL; Programmed instruction and meaningful learning

Objective: motivate students to explore topics through independent study
Finding: students begin to read books and the internet to enhance their knowledge independently or in groups
  • (Cai & Gao): SQL, E-R model, relational model; Database Course for Liberal Arts Majors
  • (Kawash et al.): SQL, Entity Relationship, Relational model; Group Exams
  • (Martin et al.): SQL, Relational Model, UML; Collaborative Learning
  • (Martinez-González & Duffing): SQL, Data Models, Physical Data Design; European convergence in higher education
  • (Amadio): SQL Programming; Team Learning
Finding: students stick to the course content
  • (Morien): Entity modeling, relational modelling; Teaching Reform
  • (Eaglestone & Nunes): SQL, ER Model; Pedagogical model, teaching and learning strategies
  • (Zheng & Dong): SQL, ER Model; Teaching Reform and Practice

Objective: focus on the gap between theory and practice
Finding: students begin to apply theoretical knowledge to developing database applications
  • (Al-Shuaily): SQL; Cognitive task, Comprehension Task
  • (Etemad & Küpçü): SQL; Cooperative group project-based learning
  • (Svahnberg et al.): SQL; Industrial project-based learning
  • (Taipalus et al.): SQL; Group learning and projects
  • (Juxiang & Zhihong): SQL, ER Model; Computational Thinking
  • (Connolly & Begg): Database concepts, Database Analysis and Design, Implementation; Constructivist-Based Approach
  • (Rashid & Al-Radhy): Database Analysis and Design; Project based learning, Assessment based learning
  • (Naik & Gajjar): Database designing, transaction management, SQL; ENABLE, Project based learning
Finding: students only focus on theory to clear exams
  • (Wang et al.): SQL Server and Oracle; Refine Teaching Method
  • (Zheng & Dong): SQL, ER Model; Teaching Reform and Practice
  • (Nelson et al.): Advanced relational design, UML, data warehousing; Teaching Methods, Assessment Methods

Students become creative and innovative when they study on their own and from different resources rather than only from curriculum books. In the modern era there are various resources available on both online and offline platforms, so modern teaching methods must emphasize making students independent of curriculum books and educate them to learn independently (Amadio et al., 2003; Cai & Gao, 2019; Martin et al., 2013). Kawash et al. (2020) proposed a group-study-based learning approach called Graded Group Activities (GGAs), in which students team up to take the exam as a group. On the other hand, a few studies have emphasized course content to prepare students for final exams; for example, Zheng and Dong (2011) discussed the issues of computer science teaching with a particular focus on database systems, presenting the characteristics of the course, its teaching content, and suggestions for teaching it effectively.

As technology is evolving at a rapid pace, students need practical experience from the start. Basic theoretical database concepts are important, but they are of little use without implementation in real-world projects. Most students study at institutes with the aim of merely clearing the exams with the help of theoretical knowledge, and very few seek practical experience (Wang & Chen, 2014; Zheng & Dong, 2011). To reduce the gap between theory and implementation, authors have proposed teaching methods that develop students' interest in real-world projects (Naik & Gajjar, 2021; Svahnberg et al., 2008; Taipalus et al., 2018). Moreover, Juxiang and Zhihong (2012) proposed that teaching be organized around application scenarios, associating database theory with the process from analysis and modeling through to establishing the database application. Also, Svahnberg et al. (2008) explained that, under particular conditions, students can be used as subjects for experimental studies in DSE and influenced by providing responses that are in line with industrial practice.
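To make the scenario-driven progression advocated by Juxiang and Zhihong (2012) concrete, the following minimal sketch shows how a simple library-loan scenario can be carried from an entity-relationship view directly into standard SQL and then exercised with a query; the table and column names are illustrative assumptions, not taken from any reviewed study.

```sql
-- Scenario: a small library records which member borrows which book.
-- ER view: MEMBER and BOOK in a many-to-many relationship resolved by LOAN.
CREATE TABLE member (
  member_id  INTEGER PRIMARY KEY,
  name       VARCHAR(100) NOT NULL
);

CREATE TABLE book (
  book_id    INTEGER PRIMARY KEY,
  title      VARCHAR(200) NOT NULL
);

CREATE TABLE loan (
  member_id  INTEGER NOT NULL REFERENCES member(member_id),
  book_id    INTEGER NOT NULL REFERENCES book(book_id),
  loaned_on  DATE    NOT NULL,
  PRIMARY KEY (member_id, book_id, loaned_on)
);

-- The theoretical concept (an associative entity resolving a many-to-many
-- relationship) is exercised immediately by a practical query:
SELECT m.name, COUNT(*) AS books_on_loan
FROM   member m
JOIN   loan   l ON l.member_id = m.member_id
GROUP  BY m.name;
```

In such an exercise, students meet the modeling decision (the associative entity) and its consequences for querying in the same sitting, rather than as separate theory and lab topics.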

On the other hand, Nelson et al. (2003) evaluated the different teaching methods used to teach different database modules in the School of Computing and Technology at the University of Sunderland. They outlined suggestions for changes to the database curriculum to further integrate research and state-of-the-art database systems.

The database curriculum has been revisited many times in the form of guidelines that not only present the contents but also suggest approximate time to cover different topics. According to the ACM curriculum guidelines (Lunt et al., 2008) for undergraduate programs in computer science, the overall coverage time for this course is 46.50 h, of which 11 h are devoted to core topics: Information Models (4 core hours), Database Systems (3 core hours), and Data Modeling (4 core hours). The remaining hours are allocated to elective topics such as Indexing, Relational Databases, Query Languages, Relational Database Design, Transaction Processing, Distributed Databases, Physical Database Design, Data Mining, Information Storage and Retrieval, Hypermedia, Multimedia Systems, and Digital Libraries (Marshall, 2012). According to the ACM curriculum guidelines (2013) for undergraduate programs in computer science, this course should be completed in 15 weeks, with a two-and-a-half-hour lecture per week and, on average, a four-hour lab session per week (Brady et al., 2004). Thus, the revised version emphasizes practice-based learning with the help of a lab component. Numerous organizations have exerted efforts in this field to classify DSE (Dietrich et al., 2008). DSE model curricula, bodies of knowledge (BOKs), and some standardization aspects in this field are discussed below:

Model curricula

Standard bodies set the curriculum guidelines for teaching undergraduate degree programs in computing disciplines. Curricula that include guidelines for teaching databases are: Computer Engineering Curricula (CEC) (Meier et al., 2008), Information Technology Curricula (ITC) (Alrumaih, 2016), Computing Curriculum Software Engineering (CCSE) (Meyer, 2001), and Cyber Security Curricula (CSC) (Brady et al., 2004; Bishop et al., 2017).

Bodies of knowledge (BOK)

A BOK comprises the set of concepts and activities related to a professional area, whereas a model curriculum gives a set of guidelines to address educational issues (Sahami et al., 2011). The database body of knowledge comprises (a) The Data Management Body of Knowledge (DMBOK), (b) Software Engineering Education Knowledge (SEEK) (Sobel, 2003), and (c) the SE Body of Knowledge (SWEBOK) (Swebok Evolution: IEEE Computer Society, n.d.).

Apart from the model curricula and bodies of knowledge, there also exist some standards related to databases and their different modules: ISO/IEC 9075-1:2016 (Computing Curricula, 1991) and ISO/IEC 10026-1:1998 (Suryn, 2003).

We also utilized advice from some studies (Elberzhager et al., 2012; Keele et al., 2007) to search for relevant papers. In order to conduct this systematic study, it is essential to formulate the primary research questions (Mushtaq et al., 2017). Since data management techniques and software are evolving rapidly, the database curriculum should also be updated accordingly to meet these new requirements. Some authors have described ways of updating course content to keep pace with specific developments in the field, while others have developed new database curricula to keep up with new data management techniques.

Furthermore, some authors have suggested updates for the database curriculum based on continuously evolving technology and the introduction of big data. For instance, Bhogal et al. (2012) have shown that database curricula need to be updated and modernized, which can be achieved by extending current database concepts to cover strategies for handling ever-changing user requirements and how database technology has evolved to meet them. Likewise, Picciano (2012) examines the evolving world of big data and analytics in American higher education. According to the author, since data-driven decision making has already entered the big data and learning analytics era, institutes should use it to evaluate strategies that can improve retention and to update the curriculum with basic big data concepts and applications. Furthermore, Marshall (2011) presented the challenges faced when developing a curriculum for a Computer Science degree program in the South African context that is earmarked for international recognition. According to the author, the curriculum needs to adhere to both policy and content requirements in order to be rated as being of a particular quality.

Similarly, some studies (Abourezq & Idrissi, 2016; Mingyu et al., 2017) described the influence of big data from a social perspective, identified gaps in the computer science database curriculum in the big data era, and explored improvements in practical and theoretical teaching modes, teaching content, and teaching practice platforms for the database curriculum. Also, Silva et al. (2016) proposed teaching SQL as a general language that can be used in a wide range of database systems, from traditional relational database management systems to big data systems.
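As a minimal sketch of the portability argument made by Silva et al. (2016), the same aggregate query can typically be run unchanged on a traditional relational engine such as PostgreSQL and on SQL-on-big-data engines such as Hive or Spark SQL; the table and column names below are illustrative assumptions, not taken from their paper.

```sql
-- One ANSI-style query over a hypothetical "orders" table; the core
-- syntax (joins, grouping, filtering) is shared by traditional RDBMSs
-- and big data SQL engines such as Hive and Spark SQL.
SELECT country,
       COUNT(*)    AS n_orders,
       SUM(amount) AS total_amount
FROM   orders
WHERE  order_date >= DATE '2021-01-01'
GROUP  BY country
HAVING COUNT(*) > 100
ORDER  BY total_amount DESC;
```

Teaching the language this way lets students move between engines without relearning the query model, which is precisely the generality that study emphasizes.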

On the other hand, different authors have developed database curricula based on the different academic backgrounds of students. For example, Dean and Milani (1995) recommended changes in computer science curricula based on practice at the United States Military Academy (USMA); they emphasized practical demonstration of topics rather than theoretical explanation, especially for non-computer-science majors. Furthermore, Urban and Dietrich (2001) described the development of a second undergraduate course on database systems, preparing students for the advanced database concepts that they will exercise in industry, and shared their experience teaching the course, elaborating on the topics and assignments. Also, Andersson et al. (2019) proposed variations in the core topics of the database management course for students with an engineering background. Moreover, Dietrich et al. (2014) described two animations, developed with images and color, that visually and dynamically introduce fundamental relational database concepts and querying to students of many majors; the goal is that educators in diverse academic disciplines can incorporate these animations into their existing courses to meet their pedagogical needs.

Information systems have evolved into large-scale distributed systems that store huge amounts of data across different servers and process them using different distributed data processing frameworks. This evolution has given birth to new paradigms in the database systems domain, termed NoSQL and Big Data systems, which deviate significantly from conventional relational and distributed database management systems. It is pertinent to mention that, in order to offer a sustainable and practical CS education, these new paradigms and methodologies, as shown in Fig. 5, should be included in database education (Kleiner, 2015). Tables 9 and 10 show the summarized findings of the curriculum-based reviewed studies. This section also proposes appropriate textbooks for theory-, project-, and practice-based teaching methodologies, as shown in Table 9. The proposed books were selected purely on the basis of their usage in top universities around the world, such as the Massachusetts Institute of Technology, Stanford University, Harvard University, University of Oxford, University of Cambridge, and University of Singapore, and their coverage of the core topics mentioned in the database curriculum.

Fig. 5

Concepts in Database Systems Education (Kleiner, 2015 )

Recommended text books for DSE

Methodology | Book title | Author(s) | Edition | Year
Theory | Database Management Systems | Ramakrishnan, Raghu, and Johannes Gehrke | 3 | 2002
Theory | Database Systems: The Complete Book | Garcia-Molina, Ullman, and Widom | 2 | 2008
Theory | Introduction to Database Systems | C. J. Date (Addison-Wesley) | 8 | 2003
Theory | Introduction to Database Systems | S. Bressan and B. Catania | 1 | 2005
Theory | Database System Concepts | Silberschatz, A., Korth, H. F., and Sudarshan, S. | 7 | 2019
Theory | A First Course in Database Systems | Ullman, J., and Widom, J. | 3 | 2007
Project | Modern Database Management | Jeffrey A. Hoffer, Ramesh Venkataraman, and Heikki Topi | 12 | 2015
Project | Database Systems: A Practical Approach to Design, Implementation, and Management | Thomas M. Connolly and Carolyn E. Begg | 6 | 2015
Practice | Fundamentals of SQL Programming | R. A. Mata-Toledo and P. Cushman (Schaum's) | 1 | 2000
Practice | Readings in Database Systems (The Red Book) | Hellerstein, Joseph, and Michael Stonebraker | 4 | 2005

Curriculum: Findings of Reviewed Literature

Objective | Findings | References | Topic(s) | Curricula | Standard bodies

Objective: Recommendations and revisions. Findings: Proposed variations based on the scope in the region.
• (Abourezq & Idrissi, ) | Topics: Big Data, SQL | Curricula: Computer Science Curricula | Standard: CS 2008
• (Bhogal et al., ) | Topics: Big Data | Curricula: Computer Science/Engineering Curriculum | Standard: CS 2008/CE 2004
• (Mingyu et al., ) | Topics: Big Data, NoSQL | Curricula: Computer Science Curricula | Standard: CS 2013
• (Picciano, ) | Topics: Big Data | Curricula: Computer Science Curricula | Standard: CS 2008
• (Silva et al., ) | Topics: Big Data, MapReduce, NoSQL, and NewSQL | Curricula: Computer Science Curricula | Standard: CS 2013
• (Calero et al., ) | Topics: Database Design, Database Administration, Database Application | Curricula: SWEBOK, DBBOK | Standard: N/A
• (Conklin & Heinrichs, ) | Topics: Database theory and database practice | Curricula: Computer Science Curricula | Standard: IS 2002, CC2001, CC2004
• (Zhanquan et al., ) | Topics: Database principles design | Curricula: Coursera, Udacity, edX | Standard: N/A
• (Robbert & Ricardo, ) | Topics: Data Models, Physical Data Design, SQL | Curricula: Computer Science Curricula | Standard: CC 2001
• (Luo et al., ) | Topics: SQL Server and Oracle | Curricula: Computer Science Curricula | Standard: CC 2004
• (Dietrich & Urban, ) | Topics: Object oriented database (OODB) systems; object relational database (ORDB) systems | Curricula: Curriculum and Laboratory Improvement Educational Materials Development (CCLI EMD) | Standard: N/A
• (Marshall, ) | Topics: Data Models, Physical Data Design, Database Schema and Design, SQL | Curricula: CS-BoK | Standard: N/A

Findings: Proposed variations based on the educational background of the students.
• (Dean & Milani, ) | Topics: SQL | Curricula: Computer Science Curricula | Standard: ACM/IEEE Computing Curricula
• (Dietrich et al., ) | Topics: Relational Databases | Curricula: Computer Science Curricula | Standard: CC 2008
• (Urban & Dietrich, ) | Topics: Relational algebra, Relational calculus, and SQL | Curricula: Engineering Curriculum 2000 | Standard: CC 2001
• (Andersson et al., ) | Topics: ER Model, Relational Model, SQL | Curricula: Engineering Curriculum | Standard: CE 2000

Objective: Relating curriculum to assessment. Findings: Proposed variations based on the assessment methods.
• (Abid et al., ) | Topics: Data Models, Physical Data Design, Database Schema and Design, SQL | Curricula: Computer Science Curricula | Standard: CS 2008
• (Adams et al., ) | Topics: ER, EER, and UML | Curricula: Computer Science Curricula | Standard: CC 2001

RQ.2 Evolution of DSE research

This section discusses the evolution of DSE research over the past 25 years, as shown in Fig. 6.

Fig. 6

Evolution of DSE studies

This study shows a significant increase in DSE research after 2004, with 78% of the selected papers published after that year. The main reason for this outcome is that some of the papers were published in well-recognized channels such as IEEE Transactions on Education, ACM Transactions on Computing Education, the International Conference on Computer Science and Education (ICCSE), and the Teaching, Learning and Assessment of Databases (TLAD) workshop. Several of the selected papers were published before 2004, but only a few articles appeared during the late 1990s, since DSE started to gain interest only after the introduction of bodies of knowledge and DSE standards. Data-intensive scientific discovery has been described as the fourth paradigm (Hey et al., 2009): the first involves empirical science and observations; the second, theoretical science and mathematically driven insights; the third, computational science and simulation-driven insights; while the fourth involves data-driven insights of modern scientific research.

Over the past few decades, students have gone from attending a one-room class to having the world at their fingertips, and it is a great challenge for instructors to develop students' interest in learning databases. This challenge has led to the development of different types of interactive tools to help instructors teach DSE in this technology-oriented era. Keeping the importance of interactive tools in DSE in perspective, various authors have proposed different interactive tools over the years. During 1995–2003, some studies (Abut & Ozturk, 1997; Mcintyre et al., 1995) introduced state-of-the-art interactive tools to teach and enhance collaborative learning among students. Similarly, during 2004–2005 more interactive tools in the field of DSE were proposed: Pahl et al. (2004) and Connolly et al. (2005) introduced a multimedia-based interactive model and a game-based collaborative learning environment, respectively.

The Internet became more common in the first decade of the twenty-first century, and its positive impact on the education sector was undeniable. Cost effectiveness, student-teacher peer interaction, and keeping in touch with the latest information were the main reasons instructors employed web-based tools to teach databases. Due to this spike in the demand for web-based tools, authors also started to introduce new instruments to assist with teaching databases. In 2007, Regueras et al. (2007) proposed an e-learning tool named QUEST with a feedback module to help students learn from their mistakes. Similarly, in 2010, multiple authors proposed and evaluated various web-based tools: Cvetanovic et al. (2010) proposed ADVICE, with functionality to monitor students' progress, while Wang et al. (2010) proposed MeTube, a variation of YouTube for database instruction. Furthermore, Nelson and Fatimazahra (2010) evaluated different web-based tools to highlight the complexities of using such instruments.

Technology has changed teaching methods in the education sector, but it cannot replace teachers; despite the amount of time most students spend online, virtual learning cannot fully recreate the teacher-student bond. In the modern era, innovation in educational technology is not meant to replace instructors or teaching methods.

During the 1990s, some studies (Dietrich & Urban, 1996; Urban & Dietrich, 1997) proposed learning and teaching methods that kept the evolving technology in view. The highlight of their work was project deliverables and assignments in which students progressed step by step, starting from a tutorial exercise and then attempting more difficult extensions of the assignment.

During 2002–2007, various authors discussed teaching and learning methods to keep pace with ever-changing database technology. Connolly and Begg (2006) proposed a constructivist approach to teaching database analysis and design. Similarly, Prince and Felder (2006) reviewed the effectiveness of inquiry learning, problem-based learning, project-based learning, case-based teaching, discovery learning, and just-in-time teaching. Also, McIntyre et al. (1995) brought to light the impact of European Union (EU) convergence on different universities across Europe and suggested a reconstruction of teaching and learning methodologies in order to teach databases effectively.

During 2008–2013, more work was done to address different methods of teaching and learning in DSE, such as the work of Dominguez and Jaime (2010), who proposed an active learning approach focused on developing students' interest in designing and developing databases. Also, Zheng and Dong (2011) highlighted various characteristics of the database course and its teaching content. Similarly, Yuelan et al. (2011) reformed database teaching methods, focusing on modern ways of education, a project-driven approach, strengthening the experimental aspects, and improving the traditional assessment method. Likewise, Al-Shuaily (2012) explored four cognitive factors that can affect the learning process of databases, with the main aim of facilitating students in learning SQL. Subsequently, Chen et al. (2012) proposed a scaffolding-based concept mapping strategy that helps students better understand database management courses. Correspondingly, Martin et al. (2013) discussed various collaborative learning techniques in DSE, keeping database systems as an introductory course.

In the years between 2014 and 2021, research in the field of DSE increased, which is the main reason that most of the teaching, learning, and assessment methods were proposed and discussed during this period. Rashid and Al-Radhy (2014) discussed the issues of traditional teaching, learning, and assessment methods for database courses at different universities in Kurdistan, focusing on reformation issues such as the absence of teaching determination and the contradiction between content and theory. Similarly, Wang and Chen (2014) summarized the main problems in teaching traditional database theory and its application, with curriculum assessment mode as the main focus of their study. Eaglestone and Nunes (2004) shared their experiences of delivering a database design course at Sheffield University, focusing on teaching the database design module to a diverse group of students from different backgrounds. Rashid (2015) discussed some important features of database courses, with the main focus on reforming the conventional teaching, learning, and assessment strategies of database courses at universities. Kui et al. (2018) reformed the teaching mode of database courses based on the flipped classroom, with initiative learning of database courses as their main focus. Similarly, Zhang et al. (2018) discussed several effective classroom teaching measures, focusing on teaching content, teaching methods, teaching evaluation, and assessment methods. Cai and Gao (2019) also carried out teaching reforms in the database course for liberal arts majors, focusing on diversified teaching modes such as the flipped classroom, case-oriented teaching, and task-oriented teaching. Kawash et al. (2020) proposed a learning approach called Graded Group Activities (GGAs), with the main focus on reforming learning and assessment methods.

The database course covers several topics that range from data modeling to data implementation and examination. Over the years, various authors have suggested updates to these topics in the database curriculum to meet the requirements of modern technologies. Authors have also proposed new curricula for students of different academic backgrounds and different areas. These curriculum reforms have helped students prepare, practically and theoretically, and enabled them to compete in the job market after graduation.

During 2003 to 2006, authors proposed various suggestions to update and develop the computer science curriculum across different universities. Robbert and Ricardo (2003) evaluated three reviews, conducted between 1999 and 2002, that were given to groups of educators, with the aim of highlighting trends in the database curriculum. Also, Calero et al. (2003) proposed a first draft of a Database Body of Knowledge (DBBOK), focusing on Databases (DB), Database Design (DBD), Database Administration (DBAd), Database Applications (DBAp), and Advanced Databases (ADVDB). Furthermore, Conklin and Heinrichs (2005) compared the content included in 13 database textbooks against the IS 2002, CC2001, and CC2004 model curricula.

Between 2007 and 2011, authors developed various database curricula. Luo et al. (2008) developed curricula at Zhejiang University City College, with the aim of nurturing students to be qualified computer scientists. Likewise, Dietrich et al. (2008) proposed techniques to assess the development of an advanced database course; the purpose behind adding an advanced database course at the undergraduate level was to prepare students to respond to industrial requirements. Also, Marshall (2011) developed a new database curriculum for a Computer Science degree program in the South African context.

During 2012 to 2021, various authors suggested updates for the database curriculum. Bhogal et al. (2012) suggested updating and modernizing the database curriculum, focusing on data management and data analytics. Similarly, Picciano (2012) examined the curriculum in American higher education, focusing on big data and analytics. Also, Zhanquan et al. (2016) proposed a design for course content and classroom teaching methods, focusing on Massive Open Online Courses (MOOCs). Likewise, Mingyu et al. (2017) suggested updating the database curriculum in view of new database technologies, focusing on big data.

The above discussion clearly shows that SQL is the most discussed topic in the literature, with more than 25% of the studies in the previous decade discussing it, as shown in Fig. 7. It is pertinent to mention that other SQL databases, such as Oracle and MS Access, are discussed under the SQL banner (Chen et al., 2012; Hou & Chen, 2010; Wang & Chen, 2014). This prominence is mainly because of SQL's ability to handle data in a relational database management system and because it directly implements database theoretical concepts. Other database topics, such as transaction management and application programming, are also among the main highlights of the topics discussed in the literature.
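As a brief illustration of why SQL is often treated as the direct implementation of the theoretical concepts taught in a first database course, the relational algebra expression "project name over a selection on gpa" maps line for line onto a SQL query; the relation and attribute names below are invented for illustration.

```sql
-- A tiny relation to make the example self-contained.
CREATE TABLE Student (name VARCHAR(50), gpa DECIMAL(3,2));
INSERT INTO Student VALUES ('Ada', 3.9), ('Ben', 3.1);

-- Relational algebra:  pi_name( sigma_{gpa > 3.5}( Student ) )
SELECT name          -- projection (pi)
FROM   Student       -- the relation
WHERE  gpa > 3.5;    -- selection (sigma)
```

This one-to-one correspondence is arguably what makes SQL such a convenient vehicle for connecting the relational model to hands-on practice.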

Fig. 7

Evolution of Database topics discussed in literature

Research synthesis, advice for instructors, and way forward

This section presents the synthesized information extracted after reading and analyzing the research articles considered in this study. To this end, it first contextualizes the tools and methods to help instructors find suitable tools and methods for their settings. Similarly, developments in curriculum design are discussed. Subsequently, general advice for instructors is given. Lastly, promising future research directions for developing new tools and methods and for revising the curriculum are also discussed in this section.

Methods, tools, and curriculum

Methods and tools

Web-based tools proposed by Cvetanovic et al. (2010) and Wang et al. (2010) have been quite useful and are growing increasingly pertinent as the online mode of education prevails around the globe during COVID-19. On the other hand, interactive tools and smart classroom methodology have also been used successfully to develop students' interest in the database class (Brusilovsky et al., 2010; Connolly et al., 2005; Pahl et al., 2004; Canedo et al., 2021; Ko et al., 2021).

One of the most promising combinations of methodology and tool was proposed by Cvetanovic et al. (2010), who developed a tool named ADVICE that helps students learn and implement database concepts using a project-centric methodology, while Connolly et al. (2005) proposed a game-based collaborative learning environment built around a methodology comprising modeling, articulation, feedback, and exploration. As a whole, project-centric teaching (Connolly & Begg, 2006; Domínguez & Jaime, 2010) and teaching database design and problem-solving skills (Wang & Chen, 2014) are two successful approaches for DSE, whereas other studies (Urban & Dietrich, 1997) proposed teaching methods that are more inclined towards practicing database concepts. A topic-specific approach has been proposed by Abbasi et al. (2016), Taipalus et al. (2018), and Silva et al. (2016) to teach and learn SQL. On the other hand, Cai and Gao (2019) developed a teaching method for students who do not have a computer science background. Lastly, some useful ways of defining assessments for DSE have been proposed by Kawash et al. (2020) and Zhang et al. (2018).

The database curriculum adopted by various institutes around the world does not address how to teach the database course to students who lack a strong computer science background. Marshall (2012), Luo et al. (2008), and Zhanquan et al. (2016) have proposed updates to the current database curriculum for students who are not from a computer science background, while Abid et al. (2015) proposed combined course content and various methodologies that can be used for teaching the database systems course. Moreover, the current database curriculum does not include topics related to the latest technologies in the database domain, a factor discussed by many other studies as well (Bhogal et al., 2012; Mehmood et al., 2020; Picciano, 2012).

Guidelines for instructors

The major conclusions of this study are the suggestions, based on impact and importance, for instructors who teach DSE; empirical studies can further provide an overview of the productivity of each method. These suggestions target instructors, the focal audience of this study. They are subjective opinions, formed after analyzing the literature and expressed as guidelines, while preserving the meaning and purpose of the original authors. Various issues emerged from the reviewed literature; some other issues were also found but were not relevant to DSE. The following suggestions provide useful guidance:

Project centric and applied approach

To inculcate database development skills in students, basic elements of database development need to be incorporated into teaching and learning at all levels, including undergraduate studies (Bakar et al., 2011). To fulfill this objective, instructors should also improve data quality awareness in DSE by assigning projects and assignments in which students assess, measure, and improve data quality using already deployed databases. They should demonstrate that the quality of data is determined not only by the effective design of a database but also by the perception of the end user (Mathieu & Khalil, 1997).
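As a small sketch of what such a data-quality assignment might look like (the table and column names are hypothetical, not taken from any reviewed study), students could be asked to profile an existing database with queries like the following and then propose constraints that would prevent the problems they find.

```sql
-- Profile a deployed table for common data-quality problems.

-- 1. Missing values in a column that should be mandatory.
SELECT COUNT(*) AS missing_emails
FROM   customer
WHERE  email IS NULL OR email = '';

-- 2. Duplicate values that a key constraint should have prevented.
SELECT email, COUNT(*) AS occurrences
FROM   customer
GROUP  BY email
HAVING COUNT(*) > 1;

-- 3. A remediation the students might then justify and apply:
-- ALTER TABLE customer ADD CONSTRAINT uq_customer_email UNIQUE (email);
```

Such exercises connect design-time concepts (keys, constraints) with the end-user perception of quality in an already running system.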

The gap between database course theory and industrial practice is large. Fresh graduates find it difficult to cope with industrial pressure because of the contrast between what they were taught at their institutes and how it is applied in industry (Allsopp et al., 2006). Instructors should involve top performers from their classes in industrial projects so that they can acquire sufficient knowledge and practice, especially in postgraduate courses. There should also be activities in which industry practitioners present real projects and share their industrial experiences with students. The gap between the theoretical and practical sides of databases was identified by Myers and Skinner (1997); in order to build practical database concepts, instructors should give students an accurate view of reality and proper tools.

Importance of software development standards and impact of DB in software success

Instructors should have the strategies, abilities, and skills to align the DSE course with contemporary Global Software Development (GSD) (Akbar & Safdar, 2015; Damian et al., 2006).

They should enable students to explain approaches to problem solving, development tools, and methodologies. Database systems courses are usually taught in a standard lecture format; as a result, students cannot see the influence of database work on the success or failure of projects because they do not realize the importance of these activities.

Pedagogy and the use of education technology

Some studies have shown that teaching through play and practical activities helps to improve students' knowledge and learning outcomes (Dicheva et al., 2015).

Interactive classrooms can help instructors deliver their lectures more effectively by using virtual whiteboards, digital textbooks, and data over the network (Abut & Ozturk, 1997). We suggest that, in order to follow the new concept of the smart classroom, instructors draw on the experience of Yau and Karim (2003), which benefits cooperative learning among students and can also be adopted in DSE.

Instructors also need to keep up with the full spectrum of technology in education in general, and in DSE in particular. This has become more imperative as, during COVID, the world relies strongly on the use of technology, particularly in the education sector.

Periodic Curriculum Revision

There is also a need to revisit the existing series of courses periodically so that they can offer the following benefits: (a) include modern database system concepts; (b) be offered as a specialization track; and (c) possibly form the basis of a specialized undergraduate degree program.

DSE: Way forward

This research brings together a significant body of work on DSE in one place, providing a starting point for finding better ways forward along the different dimensions for improving the teaching of a database systems course in the future. This section discusses the technology, methods, and curriculum modifications that would most impact the delivery of lectures in the coming years.

Several tools have already been developed for effective teaching and learning in database systems; however, there is still considerable room for developing new ones. The recent rise of the notion of "serious games" is marking its success in several domains, while the majority of the research work discussed in this review revolves around web-based tools. The success of serious games invites researchers to explore this new paradigm for developing useful tools for learning and practicing database systems concepts.

Likewise, due to COVID-19 the world is setting up new norms, which are expected to affect teaching methods as well. This invites researchers to design, develop, and test flexible tools for online teaching in a more interactive manner. At the same time, it is also imperative to devise new techniques for assessment, especially for conducting online exams at massive scale. Moreover, researchers can apply the idea of instructional design to web-based teaching, in which an online classroom is designed around the learners' unique backgrounds and effectively delivers the concepts that instructors consider most important.

The teaching, learning, and assessment methods discussed in this study can help instructors improve how they teach the database systems course. It is noticeable that only 16% of authors have assessment methods as their focus of study, which clearly highlights that plenty of work is still needed in this particular area; better assessment techniques in the database course will help learners learn from their mistakes. Also, instructors must realize that there is a massive gap between database theory and practice, which can only be reduced with extensive practice and real-world database projects.

Similarly, technology continuously influences the development and expansion of modern education, and instructors' ability to teach using online platforms is critical to the quality of online education.

In the same way, ideas like the flipped classroom, in which students prepare the lesson prior to the class, can be implemented in web-based teaching. This ensures that class time can be used for further discussion of the lesson, sharing ideas, and allowing students to interact in a dynamic learning environment.

The increasing impact of big data systems and data science, and their anticipated impact on the job market, invites researchers to revisit the fundamental course on database systems as well. There is a need to extend the boundaries of the existing contents by including concepts related to distributed big data systems (data storage, processing, and transaction management), with a possible glimpse of modern tools and technologies.

As a whole, an interesting long-term extension is to establish a generic and comprehensive framework that engages all stakeholders, with the support of technology, to make teaching, learning, practicing, and assessing easier and more effective.

This SLR presents a review of the research work published in the area of database systems education, with a particular focus on teaching the first course in database systems. The study was carried out by systematically selecting research papers published between 1995 and 2021. Based on the study, a high-level categorization presents a taxonomy of the published work under the heads of Tools, Methods, and Curriculum, and all the selected articles were evaluated on the basis of quality criteria. Several methods have been developed to teach the database course effectively; these methods focus on improving the learning experience, student satisfaction, and students' course performance, or on supporting the instructors. Similarly, many tools have been developed, whereby some are topic-based while others are general-purpose tools that apply to the whole course. The curriculum development activities have also been discussed, including the guidelines provided by ACM/IEEE along with certain standards. Apart from this, the evolution in these three areas has been presented, showing that researchers have proposed many different teaching methods throughout the selected period; however, there has been a decrease in research articles addressing curriculum and tools in the past five years. Besides, some guidelines for instructors have been shared. This SLR also proposes a way forward in DSE by emphasizing tools that need to be developed to facilitate instructors and students, especially in the post-COVID-19 era; methods to be adopted by instructors to close the gap between theory and practice; and database curricula updates after the introduction of emerging technologies such as big data and data science. We also urge that the recognized publication venues for database research, including VLDB, ICDM, and EDBT, should consider publishing articles related to DSE. The study also highlights the importance of revising the curricula, tools, and methodologies to cater for recent advancements in the field of database systems.

Data availability

Not Applicable.


  • Abbasi, S., Kazi, H., Khowaja, K., Abelló Gamazo, A., Burgués Illa, X., Casany Guerrero, M. J., Martin Escofet, C., Quer, C., Rodriguez González, M. E., Romero Moral, Ó., Urpi Tubella, A., Abid, A., Farooq, M. S., Raza, I., Farooq, U., Abid, K., Hussain, N., Abid, K., Ahmad, F., …, Yatim, N. F. M. (2016). Research trends in enterprise service bus (ESB) applications: A systematic mapping study. Journal of Informetrics, 27 (1), 217–220.
  • Abbasi, S., Kazi, H., & Khowaja, K. (2017). A systematic review of learning object oriented programming through serious games and programming approaches. 2017 4th IEEE International Conference on Engineering Technologies and Applied Sciences (ICETAS) , 1–6.
  • Abelló Gamazo A, Burgués Illa X, Casany Guerrero MJ, Martin Escofet C, Quer C, Rodriguez González ME, Romero Moral Ó, Urpi Tubella A. A software tool for E-assessment of relational database skills. International Journal of Engineering Education. 2016;32(3A):1289–1312. [ Google Scholar ]
  • Abid A, Farooq MS, Raza I, Farooq U, Abid K. Variants of teaching first course in database systems. Bulletin of Education and Research. 2015;37(2):9–25. [ Google Scholar ]
  • Abid A, Hussain N, Abid K, Ahmad F, Farooq MS, Farooq U, Khan SA, Khan YD, Naeem MA, Sabir N. A survey on search results diversification techniques. Neural Computing and Applications. 2016;27(5):1207–1229. [ Google Scholar ]
  • Abourezq, M., & Idrissi, A. (2016). Database-as-a-service for big data: An overview. International Journal of Advanced Computer Science and Applications (IJACSA) , 7 (1).
  • Abut, H., & Ozturk, Y. (1997). Interactive classroom for DSP/communication courses. 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing , 1 , 15–18.
  • Adams ES, Granger M, Goelman D, Ricardo C. Managing the introductory database course: What goes in and what comes out? ACM SIGCSE Bulletin. 2004;36(1):497–498. [ Google Scholar ]
  • Akbar, R., & Safdar, S. (2015). A short review of global software development (gsd) and latest software development trends. 2015 International Conference on Computer, Communications, and Control Technology (I4CT) , 314–317.
  • Allsopp DH, DeMarie D, Alvarez-McHatton P, Doone E. Bridging the gap between theory and practice: Connecting courses with field experiences. Teacher Education Quarterly. 2006;33(1):19–35. [ Google Scholar ]
  • Alrumaih, H. (2016). ACM/IEEE-CS information technology curriculum 2017: status report. Proceedings of the 1st National Computing Colleges Conference (NC3 2016) .
  • Al-Shuaily, H. (2012). Analyzing the influence of SQL teaching and learning methods and approaches. 10 Th International Workshop on the Teaching, Learning and Assessment of Databases , 3.
  • Amadio, W., Riyami, B., Mansouri, K., Poirier, F., Ramzan, M., Abid, A., Khan, H. U., Awan, S. M., Ismail, A., Ahmed, M., Ilyas, M., Mahmood, A., Hey, A. J. G., Tansley, S., Tolle, K. M., others, Tehseen, R., Farooq, M. S., Abid, A., …, Fatimazahra, E. (2003). The fourth paradigm: data-intensive scientific discovery. Innovation in Teaching and Learning in Information and Computer Sciences , 1 (1), 823–828. https://www.iso.org/standard/27614.html
  • Amadio, W. (2003). The dilemma of Team Learning: An assessment from the SQL programming classroom . 823–828.
  • Ampatzoglou A, Charalampidou S, Stamelos I. Research state of the art on GoF design patterns: A mapping study. Journal of Systems and Software. 2013;86(7):1945–1964. [ Google Scholar ]
  • Andersson C, Kroisandt G, Logofatu D. Including active learning in an online database management course for industrial engineering students. IEEE Global Engineering Education Conference (EDUCON) 2019;2019:217–220. [ Google Scholar ]
  • Aria M, Cuccurullo C. bibliometrix: An R-tool for comprehensive science mapping analysis. Journal of Informetrics. 2017;11(4):959–975. [ Google Scholar ]
  • Aziz O, Farooq MS, Abid A, Saher R, Aslam N. Research trends in enterprise service bus (ESB) applications: A systematic mapping study. IEEE Access. 2020;8:31180–31197. [ Google Scholar ]
  • Bakar MA, Jailani N, Shukur Z, Yatim NFM. Final year supervision management system as a tool for monitoring computer science projects. Procedia-Social and Behavioral Sciences. 2011;18:273–281. [ Google Scholar ]
  • Beecham S, Baddoo N, Hall T, Robinson H, Sharp H. Motivation in Software Engineering: A systematic literature review. Information and Software Technology. 2008;50(9–10):860–878. [ Google Scholar ]
  • Bhogal, J. K., Cox, S., & Maitland, K. (2012). Roadmap for Modernizing Database Curricula. 10 Th International Workshop on the Teaching, Learning and Assessment of Databases , 73.
  • Bishop, M., Burley, D., Buck, S., Ekstrom, J. J., Futcher, L., Gibson, D., ... & Parrish, A. (2017, May). Cybersecurity curricular guidelines . In IFIP World Conference on Information Security Education (pp. 3–13). Cham: Springer.
  • Brady A, Bruce K, Noonan R, Tucker A, Walker H. The 2003 model curriculum for a liberal arts degree in computer science: preliminary report. ACM SIGCSE Bulletin. 2004;36(1):282–283. [ Google Scholar ]
  • Brusilovsky P, Sosnovsky S, Lee DH, Yudelson M, Zadorozhny V, Zhou X. An open integrated exploratorium for database courses. AcM SIGcSE Bulletin. 2008;40(3):22–26. [ Google Scholar ]
  • Brusilovsky P, Sosnovsky S, Yudelson MV, Lee DH, Zadorozhny V, Zhou X. Learning SQL programming with interactive tools: From integration to personalization. ACM Transactions on Computing Education (TOCE) 2010;9(4):1–15. [ Google Scholar ]
  • Cai, Y., & Gao, T. (2019). Teaching Reform in Database Course for Liberal Arts Majors under the Background of" Internet Plus". 2018 6th International Education, Economics, Social Science, Arts, Sports and Management Engineering Conference (IEESASM 2018) , 208–213.
  • Calderon KR, Vij RS, Mattana J, Jhaveri KD. Innovative teaching tools in nephrology. Kidney International. 2011;79(8):797–799. doi: 10.1038/ki.2011.13. [ DOI ] [ PubMed ] [ Google Scholar ]
  • Calero C, Piattini M, Ruiz F. Towards a database body of knowledge: A study from Spain. ACM SIGMOD Record. 2003;32(2):48–53. [ Google Scholar ]
  • Canedo, E. D., Bandeira, I. N., & Costa, P. H. T. (2021). Challenges of database systems teaching amidst the Covid-19 pandemic. In 2021 IEEE Frontiers in Education Conference (FIE) (pp. 1–9). IEEE.
  • Chen H-H, Chen Y-J, Chen K-J. The design and effect of a scaffolded concept mapping strategy on learning performance in an undergraduate database course. IEEE Transactions on Education. 2012;56(3):300–307. [ Google Scholar ]
  • Cobo MJ, López-Herrera AG, Herrera-Viedma E, Herrera F. SciMAT: A new science mapping analysis software tool. Journal of the American Society for Information Science and Technology. 2012;63(8):1609–1630. [ Google Scholar ]
  • Conklin M, Heinrichs L. In search of the right database text. Journal of Computing Sciences in Colleges. 2005;21(2):305–312. [ Google Scholar ]
  • Connolly, T. M., & Begg, C. E. (2006). A constructivist-based approach to teaching database analysis and design. Journal of Information Systems Education , 17 (1).
  • Connolly, T. M., Stansfield, M., & McLellan, E. (2005). An online games-based collaborative learning environment to teach database design. Web-Based Education: Proceedings of the Fourth IASTED International Conference(WBE-2005) .
  • Curricula Computing. (1991). Report of the ACM/IEEE-CS Joint Curriculum Task Force. Technical Report . New York: Association for Computing Machinery.
  • Cvetanovic M, Radivojevic Z, Blagojevic V, Bojovic M. ADVICE—Educational system for teaching database courses. IEEE Transactions on Education. 2010;54(3):398–409. [ Google Scholar ]
  • Damian, D., Hadwin, A., & Al-Ani, B. (2006). Instructional design and assessment strategies for teaching global software development: a framework. Proceedings of the 28th International Conference on Software Engineering , 685–690.
  • Dean, T. J., & Milani, W. G. (1995). Transforming a database systems and design course for non computer science majors. Proceedings Frontiers in Education 1995 25th Annual Conference. Engineering Education for the 21st Century , 2 , 4b2--17.
  • Dicheva, D., Dichev, C., Agre, G., & Angelova, G. (2015). Gamification in education: A systematic mapping study. Journal of Educational Technology \& Society , 18 (3), 75–88.
  • Dietrich SW, Urban SD, Haag S. Developing advanced courses for undergraduates: A case study in databases. IEEE Transactions on Education. 2008;51(1):138–144. [ Google Scholar ]
  • Dietrich SW, Goelman D, Borror CM, Crook SM. An animated introduction to relational databases for many majors. IEEE Transactions on Education. 2014;58(2):81–89. [ Google Scholar ]
  • Dietrich, S. W., & Urban, S. D. (1996). Database theory in practice: learning from cooperative group projects. Proceedings of the Twenty-Seventh SIGCSE Technical Symposium on Computer Science Education , 112–116.
  • Dominguez, C., & Jaime, A. (2010). Database design learning: A project-based approach organized through a course management system. Computers \& Education , 55 (3), 1312–1320.
  • Eaglestone, B., & Nunes, M. B. (2004). Pragmatics and practicalities of teaching and learning in the quicksand of database syllabuses. Journal of Innovations in Teaching and Learning for Information and Computer Sciences , 3 (1).
  • Efendiouglu A, Yelken TY. Programmed instruction versus meaningful learning theory in teaching basic structured query language (SQL) in computer lesson. Computers & Education. 2010;55(3):1287–1299. [ Google Scholar ]
  • Elberzhager F, Münch J, Nha VTN. A systematic mapping study on the combination of static and dynamic quality assurance techniques. Information and Software Technology. 2012;54(1):1–15. [ Google Scholar ]
  • Etemad M, Küpçü A. Verifiable database outsourcing supporting join. Journal of Network and Computer Applications. 2018;115:1–19. [ Google Scholar ]
  • Farooq MS, Riaz S, Abid A, Abid K, Naeem MA. A Survey on the role of IoT in agriculture for the implementation of smart farming. IEEE Access. 2019;7:156237–156271. [ Google Scholar ]
  • Farooq MS, Riaz S, Abid A, Umer T, Zikria YB. Role of IoT technology in agriculture: A systematic literature review. Electronics. 2020;9(2):319. [ Google Scholar ]
  • Farooq U, Rahim MSM, Sabir N, Hussain A, Abid A. Advances in machine translation for sign language: Approaches, limitations, and challenges. Neural Computing and Applications. 2021;33(21):14357–14399. [ Google Scholar ]
  • Fisher, D., & Khine, M. S. (2006). Contemporary approaches to research on learning environments: Worldviews . World Scientific.
  • Garcia-Molina, H. (2008). Database systems: the complete book . Pearson Education India.
  • Garousi V, Mesbah A, Betin-Can A, Mirshokraie S. A systematic mapping study of web application testing. Information and Software Technology. 2013;55(8):1374–1396. [ Google Scholar ]
  • Gudivada, V. N., Nandigam, J., & Tao, Y. (2007). Enhancing student learning in database courses with large data sets. 2007 37th Annual Frontiers In Education Conference-Global Engineering: Knowledge Without Borders, Opportunities Without Passports , S2D--13.
  • Hey, A. J. G., Tansley, S., Tolle, K. M., & others. (2009). The fourth paradigm: data-intensive scientific discovery (Vol. 1). Microsoft research Redmond, WA.
  • Holliday, M. A., & Wang, J. Z. (2009). A multimedia database project and the evolution of the database course. 2009 39th IEEE Frontiers in Education Conference , 1–6.
  • Hou, S., & Chen, S. (2010). Research on applying the theory of Blending Learning on Access Database Programming Course teaching. 2010 2nd International Conference on Education Technology and Computer , 3 , V3--396.
  • Irby DM, Wilkerson L. Educational innovations in academic medicine and environmental trends. Journal of General Internal Medicine. 2003;18(5):370–376. doi: 10.1046/j.1525-1497.2003.21049.x. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Ishaq K, Zin NAM, Rosdi F, Jehanghir M, Ishaq S, Abid A. Mobile-assisted and gamification-based language learning: A systematic literature review. PeerJ Computer Science. 2021;7:e496. doi: 10.7717/peerj-cs.496. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Joint Task Force on Computing Curricula, A. F. C. M. (acm), & Society, I. C. (2013). Computer science curricula 2013: Curriculum guidelines for undergraduate degree programs in computer science . New York, NY, USA: Association for Computing Machinery.
  • Juxiang R, Zhihong N. Taking database design as trunk line of database courses. Fourth International Conference on Computational and Information Sciences. 2012;2012:767–769. [ Google Scholar ]
  • Kawash, J., Jarada, T., & Moshirpour, M. (2020). Group exams as learning tools: Evidence from an undergraduate database course. Proceedings of the 51st ACM Technical Symposium on Computer Science Education , 626–632.
  • Keele, S., et al. (2007). Guidelines for performing systematic literature reviews in software engineering .
  • Kleiner, C. (2015). New Concepts in Database System Education: Experiences and Ideas. Proceedings of the 46th ACM Technical Symposium on Computer Science Education , 698.
  • Ko J, Paek S, Park S, Park J. A news big data analysis of issues in higher education in Korea amid the COVID-19 pandemic. Sustainability. 2021;13(13):7347. [ Google Scholar ]
  • Kui, X., Du, H., Zhong, P., & Liu, W. (2018). Research and application of flipped classroom in database course. 2018 13th International Conference on Computer Science \& Education (ICCSE) , 1–5.
  • Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics , 159–174. [ PubMed ]
  • Lunt, B., Ekstrom, J., Gorka, S., Hislop, G., Kamali, R., Lawson, E., ... & Reichgelt, H. (2008). Curriculum guidelines for undergraduate degree programs in information technology . ACM.
  • Luo, R., Wu, M., Zhu, Y., & Shen, Y. (2008). Exploration of Curriculum Structures and Educational Models of Database Applications. 2008 The 9th International Conference for Young Computer Scientists , 2664–2668.
  • Luxton-Reilly, A., Albluwi, I., Becker, B. A., Giannakos, M., Kumar, A. N., Ott, L., Paterson, J., Scott, M. J., Sheard, J., & Szabo, C. (2018). Introductory programming: a systematic literature review. Proceedings Companion of the 23rd Annual ACM Conference on Innovation and Technology in Computer Science Education , 55–106.
  • Manzoor MF, Abid A, Farooq MS, Nawaz NA, Farooq U. Resource allocation techniques in cloud computing: A review and future directions. Elektronika Ir Elektrotechnika. 2020;26(6):40–51. doi: 10.5755/j01.eie.26.6.25865. [ DOI ] [ Google Scholar ]
  • Marshall, L. (2011). Developing a computer science curriculum in the South African context. CSERC , 9–19.
  • Marshall, L. (2012). A comparison of the core aspects of the acm/ieee computer science curriculum 2013 strawman report with the specified core of cc2001 and cs2008 review. Proceedings of Second Computer Science Education Research Conference , 29–34.
  • Martin C, Urpi T, Casany MJ, Illa XB, Quer C, Rodriguez ME, Abello A. Improving learning in a database course using collaborative learning techniques. The International Journal of Engineering Education. 2013;29(4):986–997. [ Google Scholar ]
  • Martinez-González MM, Duffing G. Teaching databases in compliance with the European dimension of higher education: Best practices for better competences. Education and Information Technologies. 2007;12(4):211–228. [ Google Scholar ]
  • Mateo PR, Usaola MP, Alemán JLF. Validating second-order mutation at system level. IEEE Transactions on Software Engineering. 2012;39(4):570–587. [ Google Scholar ]
  • Mathieu, R. G., & Khalil, O. (1997). Teaching Data Quality in the Undergraduate Database Course. IQ , 249–266.
  • Mcintyre, D. R., Pu, H.-C., & Wolff, F. G. (1995). Use of software tools in teaching relational database design. Computers \& Education , 24 (4), 279–286.
  • Mehmood E, Abid A, Farooq MS, Nawaz NA. Curriculum, teaching and learning, and assessments for introductory programming course. IEEE Access. 2020;8:125961–125981. [ Google Scholar ]
  • Meier, R., Barnicki, S. L., Barnekow, W., & Durant, E. (2008). Work in progress-Year 2 results from a balanced, freshman-first computer engineering curriculum. In 38th Annual Frontiers in Education Conference (pp. S1F-17). IEEE.
  • Meyer B. Software engineering in the academy. Computer. 2001;34(5):28–35. [ Google Scholar ]


Writing a Research Paper

Library research guide.

  • Choose Your Topic
  • Evaluate Sources
  • Organize Your Information
  • Draft Your Paper
  • Revise, Review, Refine

How Will This Help Me?

Understanding databases will help you:

  • Identify peer-reviewed articles
  • Effectively perform a search

You can find articles, books, and more in Search It . 

Understand Peer-Reviewed Articles

Steps of peer-reviewed publication

Select a Database

On the databases page you will find more than 300 different databases. To determine which one is best for your project, you can use the menu in the top left corner to search for databases by Subject.

For example, if you are doing research for Sociology, you might want to look in databases just for Sociology. Choosing this option will lead you to a listing of the best databases to use when conducting sociology research. 

Databases Page

A few general guidelines:

  • If you're doing research for an introductory course, such as ENGL 100, try ProQuest Central or Academic Search Premier . They're great places to start.
  • If you need recent newspaper articles, try ProQuest Global Newstream . 

The Directory of Open Access Journals


Find open access journals & articles.

DOAJ in numbers

80 languages

136 countries represented

13,807 journals without APCs

21,038 journals

10,538,198 article records

Quick search

About the directory.

DOAJ is a unique and extensive index of diverse open access journals from around the world, driven by a growing community, and is committed to ensuring quality content is freely available online for everyone.

DOAJ is committed to keeping its services free of charge, including being indexed, and its data freely available.
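DOAJ also exposes its index through a public API, which can be handy when you want article metadata in bulk rather than through the web interface. The sketch below is a minimal illustration only: the endpoint path, the pageSize parameter, and the results/bibjson field names are assumptions based on DOAJ's public API documentation (https://doaj.org/api/docs) and should be verified against the current API version before use.

```python
import urllib.parse
import requests

# Assumed endpoint shape for DOAJ's article search API; check https://doaj.org/api/docs.
query = urllib.parse.quote("open access publishing")   # illustrative search phrase
url = f"https://doaj.org/api/search/articles/{query}"

response = requests.get(url, params={"pageSize": 5}, timeout=30)
response.raise_for_status()

for record in response.json().get("results", []):
    bibjson = record.get("bibjson", {})                 # bibliographic block (assumed field name)
    journal = bibjson.get("journal", {}).get("title", "unknown journal")
    print(f'{bibjson.get("title")}  [{journal}]')
```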

→ About DOAJ

→ How to apply

DOAJ is twenty years old in 2023.

Fund our 20th anniversary campaign

DOAJ is independent. All support is via donations.

82% from academic organisations

18% from contributors

Support DOAJ

Publishers don't need to donate to be part of DOAJ.

News Service

  • Meet the DOAJ team: Head of Editorial and Deputy Head of Editorial (Quality)
  • Vacancy: Operations Manager
  • Press release: PubScholar joins the movement to support the Directory of Open Access Journals
  • New major version of the API to be released

→ All blog posts

We would not be able to work without our volunteers, such as these top-performing editors and associate editors.

→ Meet our volunteers

Librarianship, Scholarly Publishing, Data Management

Brisbane, Australia (Chinese, English)

Adana, Türkiye (Turkish, English)

Humanities, Social Sciences

Natalia Pamuła

Toruń, Poland (Polish, English)

Medical Sciences, Nutrition

Pablo Hernandez

Caracas, Venezuela (Spanish, English)

Research Evaluation

Paola Galimberti

Milan, Italy (Italian, German, English)

Social Sciences, Humanities

Dawam M. Rohmatulloh

Ponorogo, Indonesia (Bahasa Indonesia, English, Dutch)

Systematic Entomology

Kadri Kıran

Edirne, Türkiye (English, Turkish, German)

Library and Information Science

Nataliia Kaliuzhna

Kyiv, Ukraine (Ukrainian, Russian, English, Polish)



Databases for Science Research

This list reflects just some of the science databases available to researchers from the Smithsonian Libraries and Archives. For a complete list of subscription and vetted databases go to E-journals, E-books, and Databases . For more subject-specific resources see our Science Research Guides .

Databases that require SI network for access are indicated by "SI staff." For information about remote access see Off-Site Access to Electronic Resources .

Broad Science Research Databases

  • AGRICOLA : The National Agricultural Library's  comprehensive database covering agriculture and allied disciplines, including: chemistry, engineering, entomology, forestry, social science (general), and water resources.
  • Anthropology Plus  (SI staff): Index of journal articles, and additional resources from core and lesser-known journals from the early 19th century to today. 
  • Encyclopedia of Life (EOL) : Online encyclopedia of all living species, currently numbering over 1.9 million.
  • GeoRef  (SI staff): Over 3.4 million references in the geosciences, including journal articles, books, maps, conference papers, reports and theses.
  • Google Scholar : Accessing Google Scholar from the Smithsonian computer network provides access to library-subscribed full text. For more information, see Off-Site Access to Electronic Resources .
  • Journal Citation Reports  (SI staff): Provides impact factors and rankings of many journals in the social and life sciences based on citation analysis.
  • PubMed : Includes more than 22 million citations for biomedical literature from MEDLINE, life science journals, and online books.
  • U.S. Geological Survey Library : Among the largest geoscience library collections in the world.
  • Web of Science: Core Collection  (SI staff): Covers over 12,000 of the highest impact journals worldwide with coverage from 1900 to present. See the Web of Science training portal  for additional help resources. 
  • Worldcat (web)  or WorldCat (OCLC FirstSearch)  (SI staff): Combined search of thousands of library catalogs from around the world, including books, music, videos, and digital content records.
  • Zoological Record  (SI staff): Considered the world's leading taxonomic reference for zoological names, indexing 90% of the world literature in zoology.

Focused Science Databases

For research guides on a variety of natural history topics, including additional databases and resources, see our Science Research Guides .

  • Algaebase : Database of taxonomic, nomenclatural, and distributional information on terrestrial, marine, and freshwater algae organisms.
  • AnimalBase: Early Zoological Literature Online : Hosted by the Zoological Institute of the University of Gottingen, this database provides open access to zoological works from 1550-1770.
  • AnthroSource  (SI staff): Full-text anthropological resources from the breadth and depth of the discipline.
  • AquaDocs : A thematic repository covering the natural marine, coastal, estuarine /brackish and fresh water environments.
  • Birds of North America Online : Comprehensive resource from the Cornell Lab of Ornithology and the American Ornithologists' Union.
  • Catalogue of Life (Species 2000 - ITIS) : Project that catalogued over one million species as of 2001.
  • FishBase : Database covering the breadth of all known fish species, considered a powerful tool for ecology.
  • GreenFILE  (SI staff): Covers connections between the environment and a variety of disciplines and includes topics such as global climate change, green building, and more.
  • Index of Botanical Publications  (Harvard University Herbaria): Created to assist in the verification of publication names in the Specimen Database and the Gray Index.
  • Index of Botanists  (Harvard University Herbaria): Comprehensive database of authors and collectors in botany, mycology, including systematic publications.
  • IOPI Database of Plant Databases : Hosted by Charles Sturt University, this meta-database allows the user to search with granularity across 100 metadata fields.
  • International Plant Names Index : Nomenclatural database for the scientific names of vascular plants, linking directly to the Biodiversity Heritage Library .
  • ITIS: Integrated Taxonomic Information System : Created through international partnerships (including the Smithsonian), this database hosts authoritative taxonomic information on plants, animals, fungi, and microbes.
  • JSTOR Global Plants  (SI staff): Includes plant type specimens, taxonomic structures, scientific literature, and related materials.
  • KBD: Kew Bibliographic Databases : Selection of 24 botanical databases containing information on correspondence, herbaria, seed lists, etc.
  • Latindex : Regional Cooperative Online Information System for Scholarly Journals from Latin America, the Caribbean, Spain and Portugal. For more details, visit http://en.wikipedia.org/wiki/Latindex .
  • National Museum of Natural History Research and Collections Information System (EMu) : Over ten million specimen records covering six departments and four divisions of the National Museum of Natural History. 
  • SORA: Searchable Ornithological Research Archive : Developed by the University Libraries at the University of New Mexico, SORA is the world’s largest open access ornithological publications database.

Digitized Science Collections

  • AnimalBase: Early Zoological Literature Online :  Hosted by the Zoological Institute of the University of Gottingen, this database provides open access to zoological works from 1550-1770.
  • Biodiversity Heritage Library : A digital library containing primarily historical texts in the natural sciences.
  • Field Book Project (Smithsonian Archives) : With the purpose of illuminating unpublished works integral to scientific research, the FBP database contains 4,000 digitized field books and 9,500 catalogued field books, in total.
  • Joseph Henry Papers Project (Smithsonian Archives) : The scientific output of the first Secretary of the Smithsonian.  Contains over 170,000 documents in fifteen scientific disciplines.


How to efficiently search online databases for academic research


University libraries provide access to plenty of online academic databases that can yield good results when you use the right strategies. They are among the best sources to turn to when you need to find articles from scholarly journals, books, and other periodicals.

Searching an online research database is much like searching the internet, but the hits returned will be scholarly articles and other academic sources, depending on the subject. In this guide, we highlight 8 tips for searching academic databases.

  • Use college and university library networks.
  • Search subject-specific databases.
  • Set up search parameters.
  • Ask a librarian for help.
  • Narrow or broaden your search, as needed.
  • Use the pro features, where applicable.
  • Try a more general database.
  • Keep track of seminal works.

Tip: The best practice is to use the links provided on your library's website to access academic databases.

Most academic databases cannot be accessed for free. As authoritative resources, these multi-disciplinary databases are comprehensive collections of the current literature on a broad range of topics. Because they have a huge range of publications, public access is sometimes restricted.

College and university libraries pay for subscriptions to popular academic databases. As a student, staff, or faculty member, you can access these resources from home thanks to proxy connections.

➡️ Check out our list of EZProxy connections to see if your institution provides such a service.

Tip: Searching the right databases is key to finding the right academic journals.

Around 2.5 million articles are published each year. As a result, it's important to search the right database for the reference you need. Comprehensive databases often contain subject-specific resources and filters, and these will help you narrow down your search results. Otherwise, you will have to screen too many unrelated papers that won't give you the reference you want.

Ask a librarian or check your library's A-Z resource list to find out which databases you can access. If you do not know where to start, you can check out the three biggest academic database providers:

➡️ Take a look at our compilations of research databases for computer science or healthcare .

Unlike in a Google search, typing in full sentences will not bring you satisfactory results. Some strategies for narrowing search parameters include the following (a minimal query sketch showing keyword and date filters appears after this list):

  • Narrowing your search terms in order to get the most pertinent information from the scholarly resources you are reviewing
  • Narrowing results by filters like specific date range or source type
  • Using more specific keywords
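As a concrete illustration of these strategies, the sketch below queries PubMed's E-utilities esearch endpoint with a field-tagged search term and a publication-date window. It is a minimal sketch: the search term is invented for illustration, and most library databases expose different (or no) public APIs, so consult the documentation of whichever database you are actually using.

```python
import requests

# NCBI E-utilities esearch endpoint (public; an API key is only needed for heavy use).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

params = {
    "db": "pubmed",
    # Field tags ([ti] = title, [mh] = MeSH heading) keep the query narrow.
    "term": "medication adherence[ti] AND heart failure[mh]",   # illustrative query
    "datetype": "pdat",            # filter on publication date
    "mindate": "2020/01/01",
    "maxdate": "2023/12/31",
    "retmax": 20,                  # cap the number of IDs returned
    "retmode": "json",
}

response = requests.get(ESEARCH, params=params, timeout=30)
response.raise_for_status()
result = response.json()["esearchresult"]

print("Total matches:", result["count"])
print("First PubMed IDs:", result["idlist"])
```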

If your university library has a subject specialist in your field, you may want to contact them for guidance on keywords and other subject- and database-specific search strategies. Consider asking a librarian to meet you for a research consultation.

A specific search might not return as many results. This can be good because these results will most likely be current and applicable. If you do not get enough results, however, slowly expand the:

  • date range
  • type of journal
  • keywords

From there, you'll be able to find a wider variety of related technical reports, books, academic journals, and other potential results that you can use for your research.

Academic search engines and databases are getting smart! In the age of big data and text mining, many databases crunch millions of scientific papers to extract connections between them. Watch out for things like the following (a sketch after this list shows one way such citation links can be queried programmatically):

  • related relevant articles
  • similar academic resources
  • list of "cited by" or "citations"
  • list of references
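Many of these "smart" features can also be queried programmatically. As one example (not tied to any particular library database), the sketch below uses the public Semantic Scholar Graph API to search for papers and rank the hits by citation count, which mirrors the "cited by" lists described above and is one quick way to spot heavily cited, potentially seminal works. The query string is invented for illustration, and the field names follow Semantic Scholar's documented schema but should be verified before use.

```python
import requests

SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"

params = {
    "query": "database curriculum teaching",    # illustrative search phrase
    "fields": "title,year,citationCount",       # request only the fields we need
    "limit": 20,
}

response = requests.get(SEARCH_URL, params=params, timeout=30)
response.raise_for_status()
papers = response.json().get("data", [])

# Rank the hits by how often they have been cited -- a rough proxy for seminal works.
for paper in sorted(papers, key=lambda p: p.get("citationCount", 0), reverse=True)[:5]:
    print(f'{paper.get("citationCount", 0):>6}  {paper.get("year")}  {paper.get("title")}')
```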

When you have thoroughly finished searching a comprehensive database, you can move on to another to find more results. Some databases that cover the same topics might give you the same search results, but they might also cover an entire range of different journals or online resources.

You might prefer the search system of one database over another based on the results you get from keyword searches. One database might have more advanced search options than the other. You can also try a more general database like:

  • Web of Science

➡️ Visit our list of the best academic research databases .

There are experts in every field, people who have published a lot of scholarly content on your topic, people who get quoted or interviewed a lot and seem to be present almost everywhere. Pay attention to those names when searching a database and once you have found someone interesting, you can search for more from that person.

Also, take note of seminal articles, or those works that have been cited repeatedly within your field. Many major databases for academic journals have features that allow you to quickly determine which articles are cited most frequently.

➡️ Ready to start writing your paper? Visit our guide on how to start a research paper .

Frequently asked questions about searching online databases

Your institution's library provides access to plenty of online research databases. They are among the best sources to turn to when you need to find articles from scholarly journals and periodicals.

Searching the right databases is key to finding the right articles. Ask a librarian or check your library's website to access details. If you do not know where to start, check out the three biggest academic database providers:

Or take a look at our compilations of research databases for computer science or healthcare.

You can narrow your search by only including articles within a specific date range or unchecking certain types of journals or magazines that are included in the database but have nothing to do with your topic. Make sure to also use very specific keywords when searching.

Unlike in a Google search, typing in full sentences will not bring you satisfactory results. There are different methods to search different databases. Ask a librarian or do an internet search on how to best search your particular database.

Narrowing down a search might not return many results. If you do not get enough results, slowly expand the date range, type of journal, or keywords.


10 Free Research and Journal Databases

  • 3-minute read
  • 6th April 2019

Finding good research can be tough, especially when so much of it is locked behind paywalls . But there are free resources out there if you know where to look. So to help out, we’ve compiled a list of ten free academic search engines and databases that you should check out.

1. Google Scholar

Even if you’ve not used Google Scholar before, you’ll know Google. And, thus, you can probably guess that Google Scholar is a search engine dedicated to academic work. Not everything listed on Google Scholar will be freely available in full. But it is a good place to start if you’re looking for a specific paper, and many papers can be downloaded for free.

2. CORE

CORE is an open research aggregator. This means it works as a search engine for open access research published by organizations from around the world, all of which is available for free. It is also the world’s largest open access aggregator, so it is a very useful resource for researchers!


3. Bielefeld Academic Search Engine (BASE)

Another dedicated academic search engine, BASE offers access to more than 140 million documents from more than 6,000 sources. Around 60% of these documents are open access, and you can filter results to see only research that is available for free online.

4. Directory of Open Access Journals (DOAJ)

The Directory of Open Access Journals (DOAJ) is a database that lists around 12,000 open access journals covering all areas of science, technology, medicine, social science, and the humanities.

5. PubMed

PubMed is a search engine maintained by the NCBI, part of the United States National Library of Medicine. It provides access to more than 29 million citations of biomedical research from MEDLINE, life science journals, and online books. The NCBI runs a similar search engine for research in the chemical sciences called PubChem, too, which is also free to use.


6. E-Theses Online Service (EThOS)

Run by the British Library, EThOS is a database of over 500,000 doctoral theses. More than half of these are available for free, either directly via EThOS or via a link to a university website.

7. Social Science Research Network (SSRN)

SSRN is a database for research from the social sciences and humanities, including 846,589 research papers from 426,107 researchers across 30 disciplines. Most of these are available for free, although you may need to sign up as a member (also free) to access some services.

8. WorldWideScience

WorldWideScience is a global academic search engine, providing access to national and international scientific databases from across the globe. One interesting feature is that it offers automatic translation, so users can have search results translated into their preferred language.


9. Semantic Scholar

Semantic Scholar is an “intelligent” academic search engine. It uses machine learning to prioritize the most important research, which can make it easier to find relevant literature. Or, in Semantic Scholar’s own words, it uses influential citations, images, and key phrases to “cut through the clutter.”

10. Public Library of Science (PLOS)

PLOS is an open-access research organization that publishes several journals. But as well as publishing its own research, PLOS is a dedicated advocate for open-access learning. So if you appreciate the search engines and databases we’ve listed here, check out the rest of the PLOS site to find out more about their campaign to enable access to knowledge.



The Canadian Journal of Hospital Pharmacy

Qualitative Research: Getting Started

Zubin Austin, Jane Sutton


Address correspondence to: Dr Zubin Austin, Leslie Dan Faculty of Pharmacy, University of Toronto, 144 College Street, Toronto ON M5S 3M2, e-mail: [email protected]

INTRODUCTION

As scientifically trained clinicians, pharmacists may be more familiar and comfortable with the concept of quantitative rather than qualitative research. Quantitative research can be defined as “the means for testing objective theories by examining the relationship among variables which in turn can be measured so that numbered data can be analyzed using statistical procedures”. 1 Pharmacists may have used such methods to carry out audits or surveys within their own practice settings; if so, they may have had a sense of “something missing” from their data. What is missing from quantitative research methods is the voice of the participant. In a quantitative study, large amounts of data can be collected about the number of people who hold certain attitudes toward their health and health care, but what qualitative study tells us is why people have thoughts and feelings that might affect the way they respond to that care and how it is given (in this way, qualitative and quantitative data are frequently complementary). Possibly the most important point about qualitative research is that its practitioners do not seek to generalize their findings to a wider population. Rather, they attempt to find examples of behaviour, to clarify the thoughts and feelings of study participants, and to interpret participants’ experiences of the phenomena of interest, in order to find explanations for human behaviour in a given context.

WHAT IS QUALITATIVE RESEARCH?

Much of the work of clinicians (including pharmacists) takes place within a social, clinical, or interpersonal context where statistical procedures and numeric data may be insufficient to capture how patients and health care professionals feel about patients’ care. Qualitative research involves asking participants about their experiences of things that happen in their lives. It enables researchers to obtain insights into what it feels like to be another person and to understand the world as another experiences it.

Qualitative research was historically employed in fields such as sociology, history, and anthropology. 2 Miles and Huberman 2 said that qualitative data “are a source of well-grounded, rich descriptions and explanations of processes in identifiable local contexts. With qualitative data one can preserve chronological flow, see precisely which events lead to which consequences, and derive fruitful explanations.” Qualitative methods are concerned with how human behaviour can be explained, within the framework of the social structures in which that behaviour takes place. 3 So, in the context of health care, and hospital pharmacy in particular, researchers can, for example, explore how patients feel about their care, about their medicines, or indeed about “being a patient”.

THE IMPORTANCE OF METHODOLOGY

Smith 4 has described methodology as the “explanation of the approach, methods and procedures with some justification for their selection.” It is essential that researchers have robust theories that underpin the way they conduct their research—this is called “methodology”. It is also important for researchers to have a thorough understanding of various methodologies, to ensure alignment between their own positionality (i.e., bias or stance), research questions, and objectives. Clinicians may express reservations about the value or impact of qualitative research, given their perceptions that it is inherently subjective or biased, that it does not seek to be reproducible across different contexts, and that it does not produce generalizable findings. Other clinicians may express nervousness or hesitation about using qualitative methods, claiming that their previous “scientific” training and experience have not prepared them for the ambiguity and interpretative nature of qualitative data analysis. In both cases, these clinicians are depriving themselves of opportunities to understand complex or ambiguous situations, phenomena, or processes in a different way.

Qualitative researchers generally begin their work by recognizing that the position (or world view) of the researcher exerts an enormous influence on the entire research enterprise. Whether explicitly understood and acknowledged or not, this world view shapes the way in which research questions are raised and framed, methods selected, data collected and analyzed, and results reported. 5 A broad range of different methods and methodologies are available within the qualitative tradition, and no single review paper can adequately capture the depth and nuance of these diverse options. Here, given space constraints, we highlight certain options for illustrative purposes only, emphasizing that they are only a sample of what may be available to you as a prospective qualitative researcher. We encourage you to continue your own study of this area to identify methods and methodologies suitable to your questions and needs, beyond those highlighted here.

The following are some of the methodologies commonly used in qualitative research:

Ethnography generally involves researchers directly observing participants in their natural environments over time. A key feature of ethnography is the fact that natural settings, unadapted for the researchers’ interests, are used. In ethnography, the natural setting or environment is as important as the participants, and such methods have the advantage of explicitly acknowledging that, in the real world, environmental constraints and context influence behaviours and outcomes. 6 An example of ethnographic research in pharmacy might involve observations to determine how pharmacists integrate into family health teams. Such a study would also include collection of documents about participants’ lives from the participants themselves and field notes from the researcher. 7

Grounded theory, first described by Glaser and Strauss in 1967, 8 is a framework for qualitative research that suggests that theory must derive from data, unlike other forms of research, which suggest that data should be used to test theory. Grounded theory may be particularly valuable when little or nothing is known or understood about a problem, situation, or context, and any attempt to start with a hypothesis or theory would be conjecture at best. 9 An example of the use of grounded theory in hospital pharmacy might be to determine potential roles for pharmacists in a new or underserviced clinical area. As with other qualitative methodologies, grounded theory provides researchers with a process that can be followed to facilitate the conduct of such research. As an example, Thurston and others 10 used constructivist grounded theory to explore the availability of arthritis care among indigenous people of Canada and were able to identify a number of influences on health care for this population.

Phenomenology attempts to understand problems, ideas, and situations from the perspective of common understanding and experience rather than differences. 10 Phenomenology is about understanding how human beings experience their world. It gives researchers a powerful tool with which to understand subjective experience. In other words, 2 people may have the same diagnosis, with the same treatment prescribed, but the ways in which they experience that diagnosis and treatment will be different, even though they may have some experiences in common. Phenomenology helps researchers to explore those experiences, thoughts, and feelings and helps to elicit the meaning underlying how people behave. As an example, Hancock and others 11 used a phenomenological approach to explore health care professionals’ views of the diagnosis and management of heart failure since publication of an earlier study in 2003. Their findings revealed that barriers to effective treatment for heart failure had not changed in 10 years and provided a new understanding of why this was the case.

ROLE OF THE RESEARCHER

For any researcher, the starting point for research must be articulation of his or her research world view. This core feature of qualitative work is increasingly seen in quantitative research too: the explicit acknowledgement of one’s position, biases, and assumptions, so that readers can better understand the particular researcher. Reflexivity describes the processes whereby the act of engaging in research actually affects the process being studied, calling into question the notion of “detached objectivity”. Here, the researcher’s own subjectivity is as critical to the research process and output as any other variable. Applications of reflexivity may include participant-observer research, where the researcher is actually one of the participants in the process or situation being researched and must then examine it from these divergent perspectives. 12 Some researchers believe that objectivity is a myth and that attempts at impartiality will fail because human beings who happen to be researchers cannot isolate their own backgrounds and interests from the conduct of a study. 5 Rather than aspire to an unachievable goal of “objectivity”, it is better to simply be honest and transparent about one’s own subjectivities, allowing readers to draw their own conclusions about the interpretations that are presented through the research itself. For new (and experienced) qualitative researchers, an important first step is to step back and articulate your own underlying biases and assumptions. The following questions can help to begin this reflection process:

Why am I interested in this topic? To answer this question, try to identify what is driving your enthusiasm, energy, and interest in researching this subject.

What do I really think the answer is? Asking this question helps to identify any biases you may have through honest reflection on what you expect to find. You can then “bracket” those assumptions to enable the participants’ voices to be heard.

What am I getting out of this? In many cases, pressures to publish or “do” research make research nothing more than an employment requirement. How does this affect your interest in the question or its outcomes, or the depth to which you are willing to go to find information?

What do others in my professional community think of this work—and of me? As a researcher, you will not be operating in a vacuum; you will be part of a complex social and interpersonal world. These external influences will shape your views and expectations of yourself and your work. Acknowledging this influence and its potential effects on personal behaviour will facilitate greater self-scrutiny throughout the research process.

FROM FRAMEWORKS TO METHODS

Qualitative research methodology is not a single method, but instead offers a variety of different choices to researchers, according to specific parameters of topic, research question, participants, and settings. The method is the way you carry out your research within the paradigm of quantitative or qualitative research.

Qualitative research is concerned with participants’ own experiences of a life event, and the aim is to interpret what participants have said in order to explain why they have said it. Thus, methods should be chosen that enable participants to express themselves openly and without constraint. The framework selected by the researcher to conduct the research may direct the project toward specific methods. From among the numerous methods used by qualitative researchers, we outline below the three most frequently encountered.

DATA COLLECTION

Interviews

Patton 12 has described an interview as “open-ended questions and probes yielding in-depth responses about people’s experiences, perceptions, opinions, feelings, and knowledge. Data consists of verbatim quotations and sufficient content/context to be interpretable”. Researchers may use a structured or unstructured interview approach. Structured interviews rely upon a predetermined list of questions framed algorithmically to guide the interviewer. This approach resists improvisation and following up on hunches, but has the advantage of facilitating consistency between participants. In contrast, unstructured or semistructured interviews may begin with some defined questions, but the interviewer has considerable latitude to adapt questions to the specific direction of responses, in an effort to allow for more intuitive and natural conversations between researchers and participants. Generally, you should continue to interview additional participants until you have saturated your field of interest, i.e., until you are not hearing anything new. The number of participants is therefore dependent on the richness of the data, though Miles and Huberman 2 suggested that more than 15 cases can make analysis complicated and “unwieldy”.

Focus Groups

Patton 12 has described the focus group as a primary means of collecting qualitative data. In essence, focus groups are unstructured interviews with multiple participants, which allow participants and a facilitator to interact freely with one another and to build on ideas and conversation. This method allows for the collection of group-generated data, which can be a challenging experience.

Observations

Patton 12 described observation as a useful tool in both quantitative and qualitative research: “[it involves] descriptions of activities, behaviours, actions, conversations, interpersonal interactions, organization or community processes or any other aspect of observable human experience”. Observation is critical in both interviews and focus groups, as nonalignment between verbal and nonverbal data frequently can be the result of sarcasm, irony, or other conversational techniques that may be confusing or open to interpretation. Observation can also be used as a stand-alone tool for exploring participants’ experiences, whether or not the researcher is a participant in the process.

Selecting the most appropriate and practical method is an important decision and must be taken carefully. Those unfamiliar with qualitative research may assume that “anyone” can interview, observe, or facilitate a focus group; however, it is important to recognize that the quality of data collected through qualitative methods is a direct reflection of the skills and competencies of the researcher. 13 The hardest thing to do during an interview is to sit back and listen to participants. They should be doing most of the talking—it is their perception of their own life-world that the researcher is trying to understand. Sophisticated interpersonal skills are required, in particular the ability to accurately interpret and respond to the nuanced behaviour of participants in various settings. More information about the collection of qualitative data may be found in the “Further Reading” section of this paper.

It is essential that data gathered during interviews, focus groups, and observation sessions are stored in a retrievable format. The most accurate way to do this is by audio-recording (with the participants’ permission). Video-recording may be a useful tool for focus groups, because the body language of group members and how they interact can be missed with audio-recording alone. Recordings should be transcribed verbatim and checked for accuracy against the audio- or video-recording, and all personally identifiable information should be removed from the transcript. You are then ready to start your analysis.

DATA ANALYSIS

Regardless of the research method used, the researcher must try to analyze or make sense of the participants’ narratives. This analysis can be done by coding sections of text, by writing down your thoughts in the margins of transcripts, or by making separate notes about the data collection. Coding is the process by which raw data (e.g., transcripts from interviews and focus groups or field notes from observations) are gradually converted into usable data through the identification of themes, concepts, or ideas that have some connection with each other. It may be that certain words or phrases are used by different participants, and these can be drawn together to allow the researcher an opportunity to focus findings in a more meaningful manner. The researcher will then give the words, phrases, or pieces of text meaningful names that exemplify what the participants are saying. This process is referred to as “theming”. Generating themes in an orderly fashion out of the chaos of transcripts or field notes can be a daunting task, particularly since it may involve many pages of raw data. Fortunately, sophisticated software programs such as NVivo (QSR International Pty Ltd) now exist to support researchers in converting data into themes; familiarization with such software supports is of considerable benefit to researchers and is strongly recommended. Manual coding is possible with small and straightforward data sets, but the management of qualitative data is a complexity unto itself, one that is best addressed through technological and software support.
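For a very small, straightforward data set, a first pass at this kind of coding can even be approximated with a short script. The sketch below is a toy illustration only: the theme names, keyword lists, and transcript segments are all invented, and such keyword matching is no substitute for careful human interpretation or for dedicated software such as NVivo.

```python
from collections import defaultdict

# Invented, illustrative coding frame: each theme is triggered by a few keywords.
THEME_KEYWORDS = {
    "workload": ["busy", "time", "staffing"],
    "communication": ["explain", "listen", "understand"],
    "trust": ["confidence", "rely", "safe"],
}

def code_segment(segment: str) -> set:
    """Return the set of themes whose keywords appear in a transcript segment."""
    text = segment.lower()
    return {theme for theme, words in THEME_KEYWORDS.items()
            if any(word in text for word in words)}

# Toy transcript segments (in a real study these would be verbatim interview excerpts).
segments = [
    "We are so busy that there is never time to explain the new medication properly.",
    "Patients tell me they feel safe when they can rely on the same pharmacist.",
]

themes = defaultdict(list)
for segment in segments:
    for theme in code_segment(segment):
        themes[theme].append(segment)       # keep each quotation attached to its theme

for theme, quotes in themes.items():
    print(theme, "->", len(quotes), "segment(s)")
```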

There is both an art and a science to coding, and the second checking of themes from data is well advised (where feasible) to enhance the face validity of the work and to demonstrate reliability. Further reliability-enhancing mechanisms include “member checking”, where participants are given an opportunity to actually learn about and respond to the researchers’ preliminary analysis and coding of data. Careful documentation of various iterations of “coding trees” is important. These structures allow readers to understand how and why raw data were converted into a theme and what rules the researcher is using to govern inclusion or exclusion of specific data within or from a theme. Coding trees may be produced iteratively: after each interview, the researcher may immediately code and categorize data into themes to facilitate subsequent interviews and allow for probing with subsequent participants as necessary. At the end of the theming process, you will be in a position to tell the participants’ stories illustrated by quotations from your transcripts. For more information on different ways to manage qualitative data, see the “Further Reading” section at the end of this paper.

ETHICAL ISSUES

In most circumstances, qualitative research involves human beings or the things that human beings produce (documents, notes, etc.). As a result, it is essential that such research be undertaken in a manner that places the safety, security, and needs of participants at the forefront. Although interviews, focus groups, and questionnaires may seem innocuous and “less dangerous” than taking blood samples, it is important to recognize that the way participants are represented in research can be significantly damaging. Try to put yourself in the shoes of the potential participants when designing your research and ask yourself these questions:

Are the requests you are making of potential participants reasonable?

Are you putting them at unnecessary risk or inconvenience?

Have you identified and addressed the specific needs of particular groups?

Where possible, attempting anonymization of data is strongly recommended, bearing in mind that true anonymization may be difficult, as participants can sometimes be recognized from their stories. Balancing the responsibility to report findings accurately and honestly with the potential harm to the participants involved can be challenging. Advice on the ethical considerations of research is generally available from research ethics boards and should be actively sought in these challenging situations.

GETTING STARTED

Pharmacists may be hesitant to embark on research involving qualitative methods because of a perceived lack of skills or confidence. Overcoming this barrier is the most important first step, as pharmacists can benefit from inclusion of qualitative methods in their research repertoire. Partnering with others who are more experienced and who can provide mentorship can be a valuable strategy. Reading reports of research studies that have utilized qualitative methods can provide insights and ideas for personal use; such papers are routinely included in traditional databases accessed by pharmacists. Engaging in dialogue with members of a research ethics board who have qualitative expertise can also provide useful assistance, as well as saving time during the ethics review process itself. The references at the end of this paper may provide some additional support to allow you to begin incorporating qualitative methods into your research.

CONCLUSIONS

Qualitative research offers unique opportunities for understanding complex, nuanced situations where interpersonal ambiguity and multiple interpretations exist. Qualitative research may not provide definitive answers to such complex questions, but it can yield a better understanding and a springboard for further focused work. There are multiple frameworks, methods, and considerations involved in shaping effective qualitative research. In most cases, these begin with self-reflection and articulation of positionality by the researcher. For some, qualitative research may appear commonsensical and easy; for others, it may appear daunting, given its high reliance on direct participant– researcher interactions. For yet others, qualitative research may appear subjective, unscientific, and consequently unreliable. All these perspectives reflect a lack of understanding of how effective qualitative research actually occurs. When undertaken in a rigorous manner, qualitative research provides unique opportunities for expanding our understanding of the social and clinical world that we inhabit.

Further Reading

  • Breakwell GM, Hammond S, Fife-Schaw C, editors. Research methods in psychology. Thousand Oaks (CA): Sage Publications Ltd; 1995.
  • Strauss A, Corbin J. Basics of qualitative research. Thousand Oaks (CA): Sage Publications Ltd; 1998.
  • Willig C. Introducing qualitative research in psychology. Buckingham (UK): Open University Press; 2001.
  • Guest G, Namey EE, Mitchel ML. Collecting qualitative data: a field manual for applied research. Thousand Oaks (CA): Sage Publications Ltd; 2013.
  • Ogden R. Bias. In: Given LM, editor. The Sage encyclopedia of qualitative research methods. Thousand Oaks (CA): Sage Publications Inc; 2008. pp. 61–2.

This article is the seventh in the CJHP Research Primer Series, an initiative of the CJHP Editorial Board and the CSHP Research Committee. The planned 2-year series is intended to appeal to relatively inexperienced researchers, with the goal of building research capacity among practising pharmacists. The articles, presenting simple but rigorous guidance to encourage and support novice researchers, are being solicited from authors with appropriate expertise.

Previous article in this series:

Bond CM. The research jigsaw: how to get started. Can J Hosp Pharm . 2014;67(1):28–30.

Tully MP. Research: articulating questions, generating hypotheses, and choosing study designs. Can J Hosp Pharm . 2014;67(1):31–4.

Loewen P. Ethical issues in pharmacy practice research: an introductory guide. Can J Hosp Pharm. 2014;67(2):133–7.

Tsuyuki RT. Designing pharmacy practice research trials. Can J Hosp Pharm . 2014;67(3):226–9.

Bresee LC. An introduction to developing surveys for pharmacy practice research. Can J Hosp Pharm . 2014;67(4):286–91.

Gamble JM. An introduction to the fundamentals of cohort and case–control studies. Can J Hosp Pharm . 2014;67(5):366–72.

Competing interests: None declared.

  • 1. Creswell J. Research design: qualitative, quantitative, and mixed methods approaches. Thousand Oaks (CA): Sage Publications Ltd; 2009.
  • 2. Miles B, Huberman AM. Qualitative data analysis. Thousand Oaks (CA): Sage Publications Ltd; 2009.
  • 3. Flick U, Von Kardorff E, Steinke I. A companion to qualitative research. Thousand Oaks (CA): Sage Publications Ltd; 2004.
  • 4. Smith FJ. Research methods in pharmacy practice. London (UK): Pharmaceutical Press; 2002.
  • 5. Flick U. An introduction to qualitative research. Thousand Oaks (CA): Sage Publications Ltd; 2009.
  • 6. Hammersley M, Atkinson P. Ethnography: principles in practice. New York (NY): Taylor and Francis; 2007.
  • 7. Latif A, Boardman HF, Pollock K. A qualitative study exploring the impact and consequence of the medicines use review service on pharmacy support-staff. Pharm Pract. 2013;11(2):118–24. doi:10.4321/s1886-36552013000200009.
  • 8. What is grounded theory? Mill Valley (CA): Grounded Theory Institute; 2008 [cited 2014 Sep 29]. Available from: www.groundedtheory.com/what-is-gt.aspx
  • 9. Glaser BG, Strauss AL. The discovery of grounded theory. San Francisco (CA): Sociology Press; 1967.
  • 10. Thurston WE, Coupal S, Jones CA, Crowshoe LF, Marshall DA, Homik J, et al. Discordant indigenous and provider frames explain challenges in improving access to arthritis care: a qualitative study using constructivist grounded theory. Int J Equity Health. 2014;13:46. doi:10.1186/1475-9276-13-46.
  • 11. Hancock HC, Close H, Fuat A, Murphy JJ, Hungin AP, Mason JM. Barriers to accurate diagnosis and effective management of heart failure have not changed in the past 10 years: a qualitative study and national survey. BMJ Open. 2014;4(3):e003866. doi:10.1136/bmjopen-2013-003866.
  • 12. Patton M. Qualitative research and evaluation methods. Thousand Oaks (CA): Sage Publications Ltd; 2002.
  • 13. Arksey H, Knight P. Interviewing for social scientists: an introductory resource with examples. Thousand Oaks (CA): Sage Publications Ltd; 1999.

Duquesne University

Computer Science

  • Introduction
  • Articles, Journals, & Databases
  • Books/eBooks
  • Web Resources
  • Citing & Writing
  • Related Guides
  • Getting Help

Articles, Journals, & Databases

Below is the link to our full list of databases, as well as some key databases that may be helpful for you to consider.

  • Databases A-Z
  • arXiv : An archive and distribution server for electronic pre-prints approved for posting after moderation, but not full peer review. Open access to 1,500,749 e-prints in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. (A query sketch using arXiv's public API follows this list.)
  • Emerald Insight : Publisher of the world's widest range of management and library & information services journals, as well as a strong specialist range of engineering, applied science and technology journals.
  • Computer Science Database (ProQuest) : Search top computing journals in full text for research on subjects such as database design, software development, web commerce, LANs, WANs, intranets, and the Internet. The database includes over 350 titles, with nearly 300 available in full text.
  • Gartner.com – IT Research : Research findings on a wide range of IT-related topics, including analyses, opinions, trends, leading practices, and case studies.
  • Science Journals Database (ProQuest) : Coverage dates back to 1986, featuring over 1,145 titles, with more than 965 available in full text. In full-text format, researchers have access to all the charts, diagrams, graphs, tables, photos, and other graphical elements so vital to scientific and technical literature.
  • SPIE Digital Library and eBooks : Collection of optics and photonics research from conference proceedings and peer-reviewed journals.
  • Scopus : The largest abstract and citation database of research literature and quality web sources. Indexes content from 24,600 active titles and 5,000 publishers, rigorously vetted and selected by an independent review board, and uses a rich underlying metadata architecture to connect people, published ideas, and institutions.
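Several of the resources above can also be searched programmatically. As an example, the sketch below pulls recent e-prints from arXiv's databases category (cs.DB) through its public query API, which returns an Atom feed. The query term is invented for illustration, and the parameter names follow arXiv's published API documentation but should be checked against the current docs before you rely on them.

```python
import urllib.request
import xml.etree.ElementTree as ET

# arXiv's public query API; results come back as an Atom XML feed.
url = (
    "http://export.arxiv.org/api/query"
    "?search_query=cat:cs.DB+AND+all:indexing"    # cs.DB = databases category; term is illustrative
    "&start=0&max_results=5"
    "&sortBy=submittedDate&sortOrder=descending"
)

with urllib.request.urlopen(url, timeout=30) as response:
    feed = ET.fromstring(response.read())

ATOM = "{http://www.w3.org/2005/Atom}"
for entry in feed.findall(f"{ATOM}entry"):
    title = entry.find(f"{ATOM}title").text
    published = entry.find(f"{ATOM}published").text
    print(published[:10], "-", " ".join(title.split()))   # collapse line breaks in the title
```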

Duquesne affiliates have access to several journals relevant to computer science and information technology. See a searchable list of journals at the links below.

  • Browzine - Computer and Information Science Journals
  • List of Computer Science Journals
  • List of Information Technology Journals

ILL and Finding Full Text

Full Text Options

This guide explains some of the most frequently used ways to find full text at Duquesne.

  • Finding Full Text Guide

LibKey Nomad pops up in many databases and web pages to help you find full text. It takes less than one minute to set up.

  • See our  LibKey Nomad guide  for more information and help with setup.

Borrow From Other Libraries

You can get access to materials not available at Duquesne by contacting and borrowing from other libraries.

When Gumberg doesn’t have an article, you can order it for free electronic delivery using ILLiad.

  • See our  ILLiad Guide  for more information.
  • Last Updated: Oct 22, 2024 11:57 AM
  • URL: https://guides.library.duq.edu/computer-science

Human Relations Area Files

Cultural information for education and research

Interview Paper: Thinking Through an Interview Guide


Kiran C. Jayaram, University of South Florida

COURSE INFORMATION

Number: ANT 2410
Title: Cultural Anthropology
Time/Location: TBD
Prerequisites: None
Instructor: Prof. Kiran C. Jayaram

  ASSIGNMENT

Interview paper: Thinking Through an Interview Guide

Students will analyze how diversity affects interactions with major societal institutions (such as health care, criminal justice, education, employment, voting, military) from contemporary and/or historical perspectives.  Additionally, the project will facilitate understanding of how global issues and systems are experienced differently at local scales by identifying local problems via fieldwork and course readings and applying them to larger global concerns.

This assignment will still require you to develop an interview guide, but you will use that as a prompt to discuss specific questions.  In other words, this paper represents your ability to integrate data, insight, and reflection.  NOTE:  Because this entire research project builds upon previous classwork, you may be able to use some lines of text from the first paper in this one.

PART 1:  Construct an Interview Guide

Develop an interview guide of 7-10 questions that you would ask a potential research participant.  These should be issues that you want to clarify, confirm, or examine more deeply.

PART 2:  Interview Paper

Write a well-organized and well-written essay of 1000-2000 words (four to eight full pages), not including Works Cited or subsequent material. To do this, you will need to reference specific sources.

  Paper Sections–use these headings in your paper!

  • INTRODUCTION:  “What is this paper about?”
  • Include one or two sentences that describe what a semi-structured interview is and what type of data it can generate. Make sure to use and cite course readings, particularly when explaining the general definition of a topic.
  • Identify the research population and location.
  • An example could be “In this paper, I show how questions to a key informant about [short summary of topics of the questions] can help us understand diversity within a population and relationships between the local and global.”
  • Provide a brief description of the paper’s structure.
  • CONTEXT:  “Setting the scene”
  • This section provides background information about your research project.  It allows you to situate your local research project within a broader cultural framework.  In other words, what general information helps the reader understand the specificity of your research?
  • The research population demographics (age range, gender breakdown, class, ethnic or racial categories) at the local, regional, national, or global levels,
  • The setting for the overall research project, or
  • The setting for the interview.
  • Include in-text citations and create a Works Cited section (in Chicago Author-Date style ) for information that you introduce.
  • METHODS:  “What did you do?”

Describe what sources of data you used.  If you used a database, name it.  You do not need to put the name of books or articles, but you should cite them.

  • Brief example: “In addition to the eHRAF database, I also used several ethnographic books (author A 20??; author F 20??; author N 19??).”
  • QUESTIONS (7 minimum to 10 maximum):  “What could you learn?”

In this section, you introduce the topic you want to learn about, provide the question, explain what kind of information the question elicits (either methodologically or conceptually, as linked to keywords in the required textbook), and give a potential answer based upon ethnographic sources.

  • Massive source of written texts on populations all over the globe.   HIGHLY RECOMMENDED
  • Source of major US Anthropology scholarship through academic journals
  • Source of documentary and non-fiction films
  • Make sure to cite your sources in Chicago Author-Date format.
  • DIVERSITY DISCUSSION:  “How do your data show internal group diversity and similarities across other groups?”
  • Give a specific example of diversity a) within the population at your field site and b) between you and the population.
  • Explain how these differences may affect whom you are able to interview, how people answer your questions, and how you interpret their answers.
  • Local-to-Global:
  • How might the answers people give vary if you conducted the same interview with the same person, but in a different location?  Make sure to explain the different locations as well as what major systems under the umbrella of culture may affect the answers.
  • CONCLUSION:  “Why are your findings important?”
  • Restate your reason for constructing the interview guide.
  • Restate the thesis statement.
  • Propose future research or implications of your research for government policy or institutional practice.
  • WORKS CITED:  “What sources did you use?”
  • Make sure that you use Chicago Author-Date format for any in-text citations and their references in the Works Cited.
  • QUESTIONS:  “What did you ask?”
  • Number and list the questions you described in your paper.

Post the .doc or .docx file on Canvas by 11:59pm on the due date.  Failure to adhere to the font (type, size), margins, or paper format as described in the syllabus will cause a loss of points.

  Evaluation

You will be graded out of 100 total points, according to the following rubric:

  • 0-10 points:     Introductory Paragraph
  • 0-10 points:      Context
  • 0-40 points:     In-depth Discussion of Questions
  • 0-10 points:     Diversity discussion
  • 0-10 points:     Conclusion
  • 0-10 points:     Appropriate use of citations and correct reference format
  • 0-10 points:     Use of at least two ethnographic sources to inform answers

Large Language Models in Qualitative Research: Can We Do the Data Justice?

9 Oct 2024 · Hope Schroeder, Marianne Aubin Le Quéré, Casey Randazzo, David Mimno, Sarita Schoenebeck

Qualitative researchers use tools to collect, sort, and analyze their data. Should qualitative researchers use large language models (LLMs) as part of their practice? LLMs could augment qualitative research, but it is unclear if their use is appropriate, ethical, or aligned with qualitative researchers' goals and values. We interviewed twenty qualitative researchers to investigate these tensions. Many participants see LLMs as promising interlocutors with attractive use cases across the stages of research, but wrestle with their performance and appropriateness. Participants surface concerns regarding the use of LLMs while protecting participant interests, and call attention to an urgent lack of norms and tooling to guide the ethical use of LLMs in research. Given the importance of qualitative methods to human-computer interaction, we use the tensions surfaced by our participants to outline guidelines for researchers considering using LLMs in qualitative research and design principles for LLM-assisted qualitative data analysis tools.


  • Open access
  • Published: 22 October 2024

Foundation models assist in human–robot collaboration assembly

  • Yuchen Ji 1 ,
  • Zequn Zhang 1 ,
  • Dunbing Tang 1 ,
  • Yi Zheng 2 ,
  • Changchun Liu 1 ,
  • Zhen Zhao 1 &
  • Xinghui Li 3  

Scientific Reports volume 14, Article number: 24828 (2024)

  • Engineering
  • Mechanical engineering

Human–robot collaboration (HRC) is a novel manufacturing paradigm designed to fully leverage the advantages of humans and robots, efficiently and flexibly accomplishing customized manufacturing tasks. However, existing HRC systems lack transfer and generalization capability for environment perception and task reasoning. These limitations manifest in two ways: (1) current methods rely on specialized models to perceive scenes and need to retrain the model when facing unseen objects; (2) current methods only address predefined tasks and cannot support undefined task reasoning. To avoid these limitations, this paper proposes a novel HRC approach based on Foundation Models (FMs), including Large Language Models (LLMs) and Vision Foundation Models (VFMs). Specifically, an LLMs-based task reasoning method is introduced, utilizing prompt learning to transfer LLMs into the domain of HRC tasks and support undefined task reasoning. A VFMs-based scene semantic perception method is proposed, integrating various VFMs to achieve scene perception without training. Finally, an FMs-based HRC system is developed, comprising perception, reasoning, and execution modules for more flexible and generalized HRC. The superior performance of FMs in perception and reasoning is demonstrated by extensive experiments. Furthermore, the feasibility and effectiveness of the FMs-based HRC system are validated through a part assembly case involving a satellite component model.

Introduction

In the field of intelligent manufacturing, with the development of technologies such as Artificial Intelligence (AI) 1 , the industrial internet of things (IIoT) 2 , and Digital Twins (DT) 3 , the roles of humans and machines in the manufacturing process are being redefined, leading to a reshaping of traditional manufacturing models. This transformation shifts the manufacturing paradigm from machine-centric to human-centric, which has attracted widespread attention and exploration of Human–Robot Collaboration (HRC) 4 in the community. Industry 5.0 5 , Human-Cyber-Physical Systems (HCPS) 6 , human-centered manufacturing, etc., also provide theoretical support for the HRC manufacturing concept. Existing manufacturing systems cannot adapt to multi-variety, small-batch production tasks. In contrast, the HRC manufacturing model makes full use of human flexibility and the repetitive operation characteristics of machines, improves the flexibility and efficiency of the manufacturing system, and provides good solutions for personalized production needs.

Existing HRC systems can generally only execute predefined workflows and lack the ability to reason about unseen tasks. Supporting new environments and tasks often requires refactoring code and retraining models. The reasons for these limitations mainly include: (1) Existing methods simply use predefined human–robot workflows or build knowledge graphs of human instructions and robot actions, which limits them to specific environments and tasks; the encoded knowledge is limited and offers little ability to reason about new tasks or accommodate ad hoc intentions. (2) Existing methods use specialized models for scene semantic perception. For unseen objects in the environment, data needs to be collected and models need to be retrained, even when the changes are only incremental.

Recently, foundation models (FMs), including Large Language Models (LLMs) and Vision Foundation Models (VFMs), have become a prominent research topic, and scholars have developed a wealth of applications based on FMs. However, these applications focus almost exclusively on the information level and lack interaction with the real world. In engineering especially, FMs have not been fully and effectively utilized. This paper finds that the powerful understanding, reasoning, and generalization capabilities of FMs align well with HRC's adaptability requirements across multiple scenarios and tasks. They can address the limitations of existing HRC systems and achieve transferable scene semantic perception and generalized task reasoning.

This paper aims to use FMs to build a flexible and generalized HRC system that can understand undefined human instructions and reason out robot control code that complies with environmental constraints. At the same time, it can achieve transferable perception when facing unseen objects. Specifically, LLMs serve as the “brain” of the HRC system, receiving environmental status and human instructions to reason out reasonable robot code. VFMs are the “eyes” of the HRC system, used for semantic perception of scenes and assisting robots in downstream tasks such as grasping. To realize the functions of the above HRC system, the main contributions of this paper are as follows:

A HRC framework based on FMs is proposed, and the problem statement, functional module division, general solutions and system structure are introduced in detail.

A HRC task reasoning method based on LLMs is presented. The prompt template is constructed, which transfers LLMs to the HRC field for undefined human instruction understanding and task reasoning. Extensive experiments indicate the superior performance of LLMs on HRC tasks.

A VFMs-based scene semantic perception method is proposed, which uses multiple VFMs and Principal Components Analysis (PCA) to achieve view-independent semantic perception without training. Experiments show its superiority in recognition accuracy.

The feasibility and effectiveness of the proposed HRC system based on FMs are validated, through a HRC part assembly case of a satellite component model.

LLMs for robot task reasoning

Language-based robot control has a long research history, and the similarities between languages also provide the potential for the generalization of robot tasks. There is a lot of work in this direction: CLIPort 7 proposes a two-stream framework that utilizes a Vision Language Model (VLM) 8 trained on large-scale Internet data. It receives data from visual and language modalities to enable dexterous robot operation. VIMA 9 extends CLIPort to implement robot tasks that are inconvenient to express in words, through multi-modal prompts and interweaving visual and textual features. However, these methods can only achieve generalization between skills, that is, using trained skills to operate unseen objects, but they cannot explore new skills.

Recently, LLMs trained on large-scale datasets have shown excellent abilities in understanding, reasoning, and multi-task generalization. SayCan 10 uses LLMs as robot task planners to decompose human abstract instructions into pre-trained robot skills, and accesses a language-based reinforcement learning model to perform specific robot actions. Grounded Decoding 11 integrates grounded models based on SayCan, which are used to constrain the decoding of LLMs based on real-world observation, allowing it to consider factors such as feasibility, preference, and safety. Text2Motion 12 uses LLMs to generate task goals while generating task planning, and corrects the task based on the execution situation and task goals to avoid final task failure caused by accumulated errors. VoxPoser 13 generates more fine-grained trajectory constraints based on Text2Motion, and extends the experimental settings to a more general environment. PaLM-E 14 is notable as the first work to introduce multi-modal LLMs for long-horizon robot task reasoning. It embodies the pre-trained PaLM 15 , supports image and neural scene representations 16 as well as sensor states, and outputs robot instructions based on human language instructions. CaP 17 can take in natural language commands and write reactive and waypoint-based policies in code format.

However, these methods mainly focus on simple desktop scenarios and lack interactive feedback. This paper hopes to use the task planning capabilities of LLMs to empower the manufacturing industry, especially for the HRC manufacturing model. It can significantly improve the flexibility of the HRC system and provide a more natural way of collaboration.

VFMs for multiple visual tasks

With the widespread popularity of large-scale datasets 18 , 19 , 20 , contrastive learning 21 , and vision transformers 22 , visual foundation models (VFMs) have been greatly developed. CLIP 8 is a very representative work among them, which is trained on billion-scale image-text datasets and aligns images and text through contrastive learning. CLIP can directly perform zero-shot image retrieval and image classification. A lot of work has been done to extend CLIP to support a variety of downstream tasks. ViLD 23 uses CLIP for detection. It first extracts the object proposal, and then sends the proposal to CLIP for classification. GLIP 24 improves ViLD by unifying object detection and phrase grounding into pre-training. This approach enables GLIP to learn jointly and enhance the performance of both tasks. These methods mainly make use of the relationship between images and texts to support more visual tasks and zero-shot capabilities.

Different from the above methods, MAE 25 uses self-supervised training. It masks some patches in the image and reconstructs them in a self-supervised manner. Segment Anything Model (SAM) 26 uses MAE as the image encoder, and relies on point, box, and mask prompts to output the segmentation map of the corresponding area. It can also achieve good results in unseen scenes. SEEM 27 introduces a unified visual prompt, supports a wider range of prompts, and has strong prompt composability. SAM is trained on natural images, and many researchers have fine-tuned and improved SAM using domain-specific data. MedSAM 28 is a medical image segmentation model, which fine-tunes the SAM decoder on a large-scale medical dataset, and its performance on that dataset surpasses SAM. TAM 29 uses SAM and off-the-shelf detectors to segment and track video content. RsPrompter 30 uses semantic information to achieve automated instance segmentation of remote-sensing images. CAT 31 integrates pre-trained image captioners, SAM, and LLMs, which can interpret the content in a region based on user prompts.

The above methods focus more on supporting more vision tasks, such as Visual Question Answering (VQA) 32 , image captioning, segmentation, tracking, etc. In addition, some methods fine-tune SAM on large-scale datasets in specific fields, such as remote sensing, medicine, etc., to improve the model’s performance. Few studies focus on visual perception in industrial scenarios with few samples, especially in HRC manufacturing tasks, where the parts in the scene are changeable. This paper hopes to achieve easily transferable industrial scene awareness using VFMs, to enhance the scalability of the HRC system.

Existing HRC system and research gaps

The HRC system is designed to solve manufacturing tasks in a flexible, efficient, and human-centered manner 33 , which could be divided into (1) scene perception, (2) task reasoning, and (3) action execution.

Perceiving the status of humans and the environment is a prerequisite for effective collaboration 34 . The research community mainly improves computer vision methods in order to make them suitable for HRC scenarios and tasks. Wang et al. 35 use AlexNet 36 to recognize human actions and parts during the assembly process, providing a basis for robots to understand human actions and the environment. Sabater et al. 37 propose a novel gesture recognition method that has cross-domain and view-independent perception capabilities and shows superior performance on various datasets. Dreher et al. 38 propose a graph network classifier to recognize object-action relations in bimanual human demonstrations. Ramirez-Amaro et al. 39 present an online system for semantic representation of activities using simple semantic rules instead of sophisticated algorithms, which could be implemented on a robot. Merlo et al. 40 exploit scene graphs to extract and encode motion patterns and context, in order to recognize hand-object and object-object interactions without prior knowledge.

Based on scene observation, task reasoning enables robots to produce appropriate code, language instructions, or other forms of solutions. Zheng et al. 41 propose a visual reasoning method to construct a knowledge graph in the manufacturing field, which can infer unseen but similar instructions based on the scene graph and generalizes. Diehl et al. 42 introduce a multi-stage method to generate the planning domain from human demonstrations automatically. Shirai et al. 43 propose a new framework that obtains problem descriptions utilizing LLMs and VLMs and generates valid robot plans using symbolic planners. Lee et al. 44 present a disassembly sequence planning algorithm, which can plan and distribute disassembly tasks between humans and robots while considering numerous vital factors. Yu et al. 45 model the HRC assembly working process using a novel chessboard setting, and use a reinforcement learning method to make high-level decisions for optimizing completion time.

Instruction-action mapping and action execution are an important part and the ultimate goal of the HRC process 46 . Lagomarsino et al. 47 present a reinforcement learning approach that emphasizes perceived safety and social acceptance rather than only the physical effort needed. Lagomarsino et al. 48 exploit B-spline trajectories and a multi-objective optimization problem to dynamically adjust the robot trajectories' total execution time and smoothness based on the operator's cognitive workload and stress. Ghadirzadeh et al. 49 propose a HRC framework based on reinforcement learning, which takes human motion as input and trains an end-to-end feedback network based on human actions to provide the appropriate action with the goal of minimizing total time. Brohan et al. 50 present Robotics Transformer, which takes images and natural language instructions and outputs discretized base and arm actions, supporting large-scale real-world robot tasks.

Existing methods still utilize pre-defined human–robot task flows to achieve HRC, which do not have enough generalization ability to complete unseen tasks because of their small, specialized models. Recently, FMs have shown robust generalization capabilities in visual and language tasks. Therefore, this paper aims to address HRC problems using FMs. Specifically, for scene perception, leveraging the generalization capabilities of VFMs can effectively enable perception of new objects with data collection alone, without model retraining. For task reasoning, LLMs are utilized to enhance the task reasoning ability of the HRC system, which can handle unseen tasks using prompts with historical experience and robot APIs. For action execution, the robot directly follows the code generated by LLMs.

FMs based HRC framework

The architecture of the FMs-based HRC method is shown in Fig. 1 , including a perception module, a reasoning module, and an execution module. When performing manufacturing tasks, the workflow of the HRC system is as follows. Firstly, the perception module perceives human language instructions in real time, which are sent to the speech recognition model and converted into text descriptions. The environmental descriptions are updated at discrete frequencies, including object positions and robot status. Secondly, environmental descriptions and human instructions in the form of text are sent to LLMs, which infer robot control codes that comply with human instructions and environmental constraints based on the provided HRC prompt. The initial code is fed back to humans for evaluation. For erroneous codes, humans and LLMs interact to improve the code. When humans confirm that it is correct, the code is sent to the host computer. Finally, the host computer executes the control code and completes the collaboration task using robot low-level APIs.

Figure 1. Motivation of the FMs-based HRC method.

Problem statement

The problem statement of the HRC system based on FMs is described as follows. Given a set of robot low-level APIs, human language i , and environmental state s , LLMs need to reason out the correct robot code \(W=g(i,s)\) , and its generation process can be expressed as \(g(i,s)=\prod _{j=0}^{N} g(w_j|w_{<j},s,i)\) , where \(g(\cdot |\cdot )\) represents the generator of LLMs, and each newly generated token \(w_j\) is predicted based on the previous sequence \(\left\{ w_0, w_1, ..., w_{j-1}\right\}\) . The generated code will call the robot's low-level APIs and control the robot to execute the actions exec ( g ( i ,  s ),  APIs ). In this process, VFMs will provide fine-grained scene perception to assist the robot's precise operations.

Although LLMs encode rich semantic knowledge of the real world, they may not produce answers that are consistent with reality. For example, when the robot is asked “How should I install this Slotted screw?” , it will usually answer “You can go to the hardware store and buy a Slotted screwdriver.” , which is a logical but unrealistic answer. To make robots correctly understand human instructions and infer code that complies with environmental constraints and human goals, three aspects need to be considered: (1) How to design low-level APIs for robots; (2) How LLMs perceive scene information; (3) How to make LLMs reason out control codes.

Design of low-level robot APIs

Reasonable low-level APIs are the foundation for LLMs to generate high-quality code. The combination of APIs should be able to handle most tasks, and they should also be as independent as possible, following atomicity, so that APIs can be reused across various robot tasks. Ideally, a set of common APIs should be designed that can be called by LLMs to solve tasks described by various human instructions. However, this is difficult to achieve because of the many complex task requirements in practice, which additionally require some task-specific APIs. For example, generating grasping poses based on object detection results is one such particular API.

Therefore, robot APIs can be summarized into three categories, as shown in Table 1 . They are (1) Task-independent APIs, which define the basic functions of the robot. (2) Task-dependent APIs, targeting special tasks. (3) Configuration, which is a configuration file that stores variables and records key environment descriptions. For different tasks, users only need to provide new configuration files and design some particular APIs.

APIs are not only function implementations, but also interfaces exposed to LLMs, allowing LLMs to understand their functions and know how to call them. The best practices in this paper indicate that APIs should be described in concise and clear language, including the names, parameters, return values, and functions. For example, “get_storage_location (space_des, obj_name): Input space description and object name. Return the 6D pose for its storage location.” is an API description. It describes that to obtain the part storage location, the function “get_storage_location (*,*)” needs to be called, and the parameters scene description and object name are required. The variable types of the formal parameters and return value are implicitly specified by the solution examples.
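To make this concrete, the short Python sketch below shows one way the API descriptions and their implementations might be organized so an LLM can see the former and the interpreter can call the latter. Only get_storage_location is quoted from the text above; every other name and the function body are hypothetical placeholders, not the paper's actual code.

```python
# Illustrative sketch only: how robot API descriptions exposed to the LLM and their
# implementations might live side by side. Names other than get_storage_location,
# and all bodies, are hypothetical placeholders.

ROBOT_API_DESCRIPTIONS = [
    # Task-independent APIs: basic robot functions (hypothetical).
    "move_to(pose): Input a 6D pose. Move the end-effector to that pose.",
    "open_gripper(): Open the gripper.",
    "close_gripper(): Close the gripper.",
    # Task-dependent API quoted in the paper.
    "get_storage_location(space_des, obj_name): Input space description and "
    "object name. Return the 6D pose for its storage location.",
]

def get_storage_location(space_des: dict, obj_name: str) -> list:
    """Hypothetical implementation: look up an object's storage pose in the
    configuration-derived space description."""
    return space_des["part_space"][obj_name]  # 6D pose [x, y, z, rx, ry, rz]

def build_api_section() -> str:
    """Join the one-line descriptions into the 'Robot APIs' part of the prompt."""
    return "Robot APIs:\n" + "\n".join(f"- {d}" for d in ROBOT_API_DESCRIPTIONS)
```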

Connecting perception to LLMs

Multi-dimensional perception is the basis for LLMs to understand the real world. In the experimental setting of this paper, perception can be divided into three categories: (1) scene observation; (2) robot status; and (3) human language intention.

(1) For scene observation, its description can be divided into two granularities, which act at different stages. What is fed into LLMs is a coarse-grained scene description, such as “Space[observation]: [part_space: [battery,...], ...]”, including parts, tools, and their locations, because LLMs are mainly responsible for thinking and reasoning and do not involve robot trajectory and position control. Introducing fine-grained and overly redundant information may affect the LLM’s performance. Therefore, fine-grained information is introduced during code execution, obtained by VFMs through APIs. It provides information, including object coordinates, relative positions, etc., for the robot’s fine operation. In terms of scene updates, this paper uses LLMs to generate updated scene observations based on the control code. (2) For robot status, it can be obtained through the robot controller. However, this paper explores another way, resembling the update approach for scene observation, to proactively update the robot status through LLMs, discussed in Prompt for HRC environment updating. This achieves a unified and pure environment update method. (3) For human language instruction, it can be received by sensors and sent to the speech recognition model, which converts audio signals to text. The speech recognition model is activated through wake words and sleeps when there is no signal input for a while. Finally, the text instruction is fed into LLMs for reasoning.

Figure 2. The workflow of HRC system with algorithm embedded.

Based on environmental awareness, connecting multi-dimensional data to LLMs is an important part of task reasoning. However, LLMs can usually only handle text, and have poor understanding and processing capabilities for other modalities. To support multi-modal data, LLMs need architecture design and modal alignment, which requires a lot of data and cost. For reasoning tasks, the focus is on connections between abstract concepts rather than visual representations. Therefore, it is reasonable to use clear text in place of vision. This paper uses standardized templates to convert visual information into abstract text descriptions to indirectly support the vision modality, e.g., Space[observation]:[tool_space: [“one slotted screwdriver”, ...], ...]; Robot[sensor]: [location: “deliver_space”, ...]; Human[instruction]: please open your gripper.
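A minimal sketch of this template rendering follows, assuming perception results arrive as ordinary Python dictionaries and lists; the field names and values are illustrative, based only on the example strings quoted above.

```python
# Minimal sketch: render perception results into the standardized text templates
# quoted above so they can be fed to an LLM as plain text. All values are examples.

def render_observation(tool_space, part_space, robot_state, instruction):
    space = (f"Space[observation]: [tool_space: {tool_space}, "
             f"part_space: {part_space}]")
    robot = f"Robot[sensor]: {robot_state}"
    human = f"Human[instruction]: {instruction}"
    return "\n".join([space, robot, human])

# Example usage with hypothetical values:
text = render_observation(
    tool_space=["one slotted screwdriver"],
    part_space=["battery", "stringer"],
    robot_state={"location": "deliver_space", "gripper": "open"},
    instruction="please open your gripper",
)
print(text)
```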

Connecting LLMs to robot

This paper hopes to build a LLM for HRC task reasoning, which can understand human language instructions and limit the output to a narrow skill set, that is, the defined low-level robot APIs, to avoid meaningless output. The research community has made a lot of explorations in fine-tuning LLMs, mainly including prompt engineering, prompt tuning 51 , instruction tuning 52 , etc. Instruction tuning may provide better results, but it often requires thousands of dollars in training costs and is not an economical option. Therefore, this paper uses prompt engineering to provide a well-defined prompt template for HRC task reasoning. See LLMs for task reasoning in HRC for details.

System implementation

Based on the above problem statement and analysis, the system architecture designed in this paper is shown in Fig. 2 , and the workflow of the system follows the arrow sequence. First, the microphone acquires human speech signals in real time. The speech is converted to text using a speech recognition model, and it is fed into LLMs along with the initial environment description. Secondly, LLMs based on the HRC prompt template infer the robot control code and the updated environment description. Thirdly, the robot code is displayed to humans through AR glasses or a screen for evaluation. For erroneous codes, humans and LLMs interactively correct them. Fourthly, the correct code is sent to the Python interpreter for execution, which calls VFMs through APIs at the appropriate time to get fine-grained scene perception. Finally, the robot completes the collaborative task, while the updated environment description is stored for the next instruction from humans.
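The control flow of one such interaction round can be summarized as in the sketch below. Every helper here is a stub standing in for the speech recognition model, the LLM call, the human review step, and the code execution described above; none of the names are the paper's actual modules.

```python
# High-level sketch of one workflow round; helpers are stubs so the control flow is
# visible. All names and return values are hypothetical placeholders.
from typing import Tuple

def listen_microphone() -> bytes: return b""                       # placeholder audio
def speech_to_text(audio: bytes) -> str: return "please open your gripper"
def llm_reason(env: str, instr: str) -> Tuple[str, str]:
    return "robot.open_gripper()", env                              # code + updated env
def human_review_and_fix(code: str) -> str: return code             # human confirms/edits
def execute_robot_code(code: str) -> None: print("executing:", code)

def hrc_step(env_description: str) -> str:
    instruction = speech_to_text(listen_microphone())               # 1. speech -> text
    code, env_update = llm_reason(env_description, instruction)     # 2. LLM reasoning
    code = human_review_and_fix(code)                               # 3. display and correct
    execute_robot_code(code)                                        # 4. run via low-level APIs
    return env_update                                               # 5. stored for next turn
```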

LLMs for task reasoning in HRC

LLM preliminaries

Based on the analysis in Connecting LLMs to robot, this paper selects prompt engineering to make sure LLMs infer robot control codes. Prompt engineering constrains the output of LLMs by providing a domain-specific prompt template, prefixed to user requests, which describes the common setting of similar problems. LLMs model language sequences as a probability distribution: \(p(W)=p(w_0, w_1, ..., w_N)\) , where the selection of words \(w_j\) forms a linguistically meaningful sequence W as much as possible. This is typically accomplished using chain rule decomposition: \(p(W)=p(w_1)p(w_2|w_1)\cdot \cdot \cdot p(w_n|w_1, w_2, ..., w_{n-1})\) , where the generation of new content is expressed as a conditional probability. Therefore, by providing appropriate prompt templates, it is possible to constrain LLMs in the generation of subsequent content: \(p(W)=\prod _{j=0}^{N} p(w_j|w_{<j}, prompt)\) .

A well-defined prompt template needs to follow the principles of simplicity, clarity, and completeness. Providing such prompt templates can prompt LLMs to produce more accurate and suitably biased responses, enabling LLMs to excel in specific domains. The best practices outlined in this paper indicate that constructing a well-defined HRC prompt template needs to follow a structured construction approach. Additionally, it is recommended to incorporate strategies such as Chain of Thought (CoT) 53 , Few-shot prompting 54 , and the Step of Thought (SoT) strategy proposed in this paper. SoT represents a fine-grained thinking process, requiring LLMs to output comments before generating each line of code.

Figure 3. HRC prompt template.

Prompt for HRC task reasoning

The structure of the HRC prompt template is illustrated in Fig. 3 . On the left side are the specific values and API implementations corresponding to variables in the prompt template. They are not explicitly specified in the prompt template, but are presented as a configuration file or robot APIs, awaiting code calls that are generated by LLMs. On the right side is the content of the prompt template. In general, the content is categorized and expressed through subheadings and contextual lists. This paper explains the meaning and design philosophy of each part in order. (1) Role: At the outset of the prompt, it describes the task background, similar to hypnosis or role-play, which facilitates the swift adaptation of LLMs to a specific domain. This part is set as the system variable of the LLM, serving the purpose of specifying the task domain throughout the entire conversation and mitigating the risk of forgetting. (2) Region: It describes the HRC environment settings, including the robot’s direction, coordinate system, and key locations. This provides support for the robot in comprehending directional instructions from humans and executing tasks in specified areas. (3) Robot APIs: It is a list of APIs exposed to LLMs, enabling LLMs to comprehend which APIs are accessible and preventing the generation of hypothetical APIs. See Design of low-level robot APIs for more details. (4) Environment: Herein, an abstract definition of scene observation and robot state is presented, establishing a foundation for LLMs to comprehend specific input content. (5) Output format: LLMs are informed of output content and output specifications, including the generation of the thinking process and asking questions when facing uncertain queries. The output is standardized in the form of tags and content to facilitate code analysis. (6) Examples: Some examples of solutions are provided here, where environmental states and human instructions are fed into LLMs along with their correct responses. Compared to the above complex task descriptions and background settings, solution examples are more effective. LLMs generate responses based on conditional probabilities, which may mean that they search for similar examples or interpolate among solution examples to address new problems. So the solution examples should be as differentiated as possible, which helps expand the exploration space of LLMs.
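A rough Python sketch of how these six parts could be assembled into one prompt string is given below. The section contents here are abbreviated placeholders; the paper's full template text is not reproduced.

```python
# Sketch: assemble the six-part HRC prompt template into a single string.
# The contents of each section are abbreviated, hypothetical placeholders.

PROMPT_SECTIONS = {
    "Role": "You are a robot assistant in a human-robot collaborative assembly cell.",
    "Region": "Coordinate system, robot direction, and key locations: assembly_space, tool_space, ...",
    "Robot APIs": "- get_storage_location(space_des, obj_name): ...",
    "Environment": "Formats of Space[observation], Robot[sensor], and Human[instruction].",
    "Output format": "First output your reasoning, then the code inside tags; "
                     "ask a question if the instruction is ambiguous.",
    "Examples": "Example 1: <environment> ... <instruction> ... <correct response> ...",
}

def build_hrc_prompt(sections: dict) -> str:
    """Concatenate the template sections under subheadings, in the order listed."""
    return "\n\n".join(f"## {name}\n{content}" for name, content in sections.items())

hrc_prompt_template = build_hrc_prompt(PROMPT_SECTIONS)
```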

Prompt for HRC environment updating

The environment description and human instruction are constantly updated, as shown by the gradient color in Fig. 3 . All of them are fed into LLMs for response. Space[observation] should be obtained through the perception module, but this paper explores an alternative method, due to the limitation that the camera cannot capture all scenes in the experiment. In the experimental setting, only the robot itself can cause state changes in the region covered by Space[observation] . Therefore, it is reasonable to let LLMs autonomously update the scene observation Space[observation-update] based on the original environment description and HRC code. Specifically, this paper explores two methods for updating scene observation, namely Union and Decomposition . Union directly generates Space[observation-update] immediately after Robot[code] using a single LLM with a combined prompt. Decomposition employs two independent LLMs with two different prompts, one for generating HRC code and the other for producing Space[observation-update] based on Robot[code] . Experimental results indicate that the Union method yields better outcomes and has a time advantage. Evaluation of task reasoning compares the accuracy of these two methods.

VFMs for semantic perception in HRC

Multi-model fusion for semantic segmentation

Scene perception is the basis for robots to execute actions. Specialized models handle perception tasks through supervised learning. However, they lack generalization, requiring re-annotating data and retraining models for new objects. Recently, VFMs have shown strong generalization, which can quickly and easily transfer to new scenes without additional training.

SAM (Segment Anything Model) 26 is a segmentation model with exceptional generalization capabilities that can segment unseen scenes, but it cannot understand semantics. CLIP 8 aims to model the relationship between images and text. However, it is trained on internet datasets, making it challenging to find corresponding text descriptions for industrial objects. Considering that CLIP establishes associations between images and text, the features it extracts resemble semantic representations. Let there be an object image i and its textual label t . After undergoing CLIP encoding, the features for the image and text are denoted as \(E_I(i)\) and \(E_T(t)\) respectively, where \(E_I(\cdot )\) and \(E_T(\cdot )\) represent CLIP’s image encoder and text encoder. The cosine similarity between these features satisfies \(\left\langle E_I(i), E_T(t) \right\rangle \rightarrow 1\) , where \(\left\langle \cdot ,\cdot \right\rangle\) denotes cosine similarity. For another image \(\hat{i}\) of the same object, \(\left\langle E_I(\hat{i}),E_T(t) \right\rangle \rightarrow 1\) also holds; hence, among diverse view images of the object, \(\left\langle E_I(\hat{i}), E_I(i) \right\rangle \rightarrow 1\) as well. Therefore, for industrial objects that are difficult to describe with text, it is reasonable to use multi-view images as labels instead of text.
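The following sketch illustrates the cross-view similarity argument, assuming the open-source CLIP package (openai/CLIP) with the ViT-B/32 checkpoint; the image file names are placeholders standing in for two views of the same part.

```python
# Sketch, assuming the open-source CLIP package and two placeholder image files of
# the same part from different views; it checks that their CLIP features are similar.

import torch
import clip
from PIL import Image

device = "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def image_feature(path: str) -> torch.Tensor:
    """Encode one image with CLIP and L2-normalize the feature."""
    image = preprocess(Image.open(path)).unsqueeze(0).to(device)
    with torch.no_grad():
        feat = model.encode_image(image)
    return feat / feat.norm(dim=-1, keepdim=True)

f_view_a = image_feature("battery_view_a.png")   # placeholder paths
f_view_b = image_feature("battery_view_b.png")
cosine = (f_view_a @ f_view_b.T).item()          # close to 1 for the same object
print(f"cross-view cosine similarity: {cosine:.3f}")
```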

Figure 4. View-independent semantic segmentation with VFMs.

This paper hopes to leverage the generalization capabilities of VFMs to help the HRC system easily adapt to new scene perception. SAM excels in image segmentation for unseen scenes, while CLIP is good at the semantic understanding of individual objects. Integrating these two models can achieve scene comprehension without additional training, as shown in Fig. 4 . Firstly, a raw image is input into SAM to obtain a segmentation map without semantic information. Secondly, the CLIP encoder extracts image features from all masked regions in the segmentation map. Finally, these segmented regions are mapped to view-independent dimensions through the minPCA module, then sent to the Semantic Anything module (Fig. 6 ) along with view-independent labels of various categories in the dataset. By comparison, the category with the greatest similarity is considered as the semantic label of each segmented region.

Enhancing view-independent perception capability

CLIP encodes images into view-independent features, but it still has limitations. For industrial parts without clear semantic labels, its view consistency representation capability is limited, often leading to the misclassification of the same object’s different views as distinct objects. Therefore, to enhance CLIP’s view-independent feature extraction capability, a natural approach is to either (1) disperse features among various objects, or (2) bring closer the features of the same object’s different views. Principal Component Analysis (PCA) is a widely used feature mapping and dimensionality reduction algorithm designed to compress sample features into dimensions with significant differences. The compressed dimension \(v_{pca}\) is constrained by the number of samples n and the initial feature dimension \(v_{ori}\) , with the maximum value of \(v_{pca}\) being the minimum between the number of samples and the initial feature dimension, i.e., \(\max (v_{pca}) = \min (n,v_{ori})\) . When the number of object categories n is much smaller than the initial feature dimension \(v_{ori}\) , PCA compresses features into quite small dimensions. Such low-dimensional features cannot represent categories and are challenging for classification tasks.

Figure 5. MinPCA reduces the differences of multi-view feature.

Therefore, this paper aims to improve classification accuracy by bringing closer the features from different views of the same object, rather than enlarging the feature differences among different categories. This paper uses the minPCA module to obtain the view-independent features of the part and the transformation matrix used for feature mapping, as shown in Fig. 4 . Specifically, the pseudo code of minPCA for a category is outlined in Fig. 6 . Firstly, segmentation maps from different views of the object are extracted. Then, features from CLIP-encoded segmented images are used to extract linearly independent features and the transformation matrix. In contrast to PCA, which selects the top k dimensions, minPCA selects the last k dimensions for feature mapping, because the objective is to identify feature dimensions with small variations across distinct views. Furthermore, the transformation matrix is applied to the original multi-view features, followed by averaging and normalization to obtain view-independent category features. Finally, the transformation matrix and view-independent features are saved for later use during the inference stage.
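A minimal numpy sketch of this idea is given below, under the assumption that "selecting the last k dimensions" means keeping the principal directions with the smallest variance across views; the shapes and the choice of k are illustrative, not the paper's exact pseudo code from Fig. 6.

```python
# Minimal numpy sketch of the minPCA idea: project multi-view CLIP features onto the
# k lowest-variance principal directions and average them into a category label.

import numpy as np

def min_pca(view_features: np.ndarray, k: int):
    """view_features: (n_views, d) CLIP features of one category from different views.
    Returns (transform, label): a (d, k) projection onto the k lowest-variance
    directions and the normalized view-independent category feature (k,)."""
    centered = view_features - view_features.mean(axis=0, keepdims=True)
    # SVD of the centered matrix; rows of vt are principal directions ordered from
    # largest to smallest variance, so the last k rows vary least across views.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    transform = vt[-k:].T                      # (d, k): keep the last k directions
    projected = view_features @ transform      # map every view into that subspace
    label = projected.mean(axis=0)
    label /= np.linalg.norm(label)             # averaged, normalized category feature
    return transform, label
```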

This paper collects some samples for verification, as shown in Fig. 5 . The red numbers represent the differences among the original features extracted by CLIP, while the green numbers represent the differences among features enhanced by minPCA . It can be observed that the feature differences among the same object’s different views significantly decrease. Although the feature differences among objects also decrease, their magnitudes are significantly smaller. This indicates that minPCA can reduce feature differences among views, consequently relatively increasing feature differences across categories.

In the inference stage, the features of the various segmented regions are sent to the Semantic Anything for classification, as shown in Fig. 6 . Specifically, for segmented regions extracted by SAM, their features are mapped to the different feature spaces through minPCA . Then cosine similarities are calculated between input features and feature labels of each category. The category with the greatest similarity is the region’s category. Although similarity needs to be computed separately for each category, multiple segmented regions can be processed in parallel, and it is a purely numerical calculation without much computational overhead.
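A companion sketch for this inference step is shown below: it classifies one SAM-segmented region by mapping its CLIP feature through each category's saved minPCA transform and comparing it with that category's view-independent label. The data structures are illustrative and assume the min_pca sketch above.

```python
# Sketch of the per-region classification step described above.

import numpy as np

def classify_region(region_feature: np.ndarray, category_models: dict) -> str:
    """region_feature: (d,) CLIP feature of a segmented region.
    category_models: {name: (transform (d, k), label (k,))} as produced by min_pca."""
    best_name, best_sim = None, -np.inf
    for name, (transform, label) in category_models.items():
        mapped = region_feature @ transform
        mapped /= np.linalg.norm(mapped) + 1e-8
        sim = float(mapped @ label)            # cosine similarity in the mapped space
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name
```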

Figure 6. Pseudo codes of minPCA and semantic anything.

Results and discussion

Evaluation of task reasoning

This paper creates the HRCA-Code dataset to evaluate the code generated by LLMs based on the HRC prompt template. It consists of common HRC tasks in the experimental setting, including 26 simple tasks, 24 medium tasks, and 16 difficult tasks, as presented in Table 2 . Task difficulty is categorized based on three aspects: (1) the number of steps required for the task: the more code needed, the more challenging the task; (2) the introduction of environment information: the more the code is constrained by environmental conditions, the more challenging the task; (3) task complexity: the higher the abstraction level of the task instructions, the more difficult the task, which requires LLMs to think deeply instead of simply following instructions. Each task includes scene observation (objects’ positions in the scene), robot status (position, open/close, grasped object, velocity), human instructions, and correct Python code for these queries.

For the approach of automatically updating the environment, discussed in Prompt for HRC task reasoning, this paper introduces an additional HRCA-Env dataset to assess the quality of LLMs in environmental updates. Tasks in the HRCA-Env dataset are derived from the HRCA-Code dataset, including 17 medium and all difficult tasks, excluding those not involving robot movement. The labels for these tasks are the correct environmental descriptions after executing the right code.

Experimental details

This paper conducts experiments using the APIs of gpt-3.5-turbo and gpt-4-0613 . For each task, LLMs are reinitialized to avoid the influence of historical conversations. The temperature hyperparameter is set to 0.2 to keep output stable while preserving the ability to ask questions about uncertain human instructions. The remaining parameters use default values. Since the Claude2 API could not be accessed, its experiments were conducted through the web interface. For each task, a score of 1 is assigned for generated code that is correct, and 0 for incorrect code.
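For orientation, a sketch of how one task could be sent to gpt-3.5-turbo with temperature 0.2 is shown below, assuming the current OpenAI Python SDK; hrc_prompt_template and the task text are placeholders standing in for the prompt and dataset entries described above, not the paper's actual harness.

```python
# Sketch: send one HRC task to gpt-3.5-turbo at temperature 0.2 via the OpenAI SDK.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def reason_hrc_code(hrc_prompt_template: str, task_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0.2,                 # stable output, as in the experiments
        messages=[
            {"role": "system", "content": hrc_prompt_template},
            {"role": "user", "content": task_text},
        ],
    )
    return response.choices[0].message.content
```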

Tables 3 , 4 , 5 , 6 and 7 report the main experimental results on the HRCA-Code and HRCA-Env datasets. Table 3 shows the best experimental results, utilizing the best prompt strategy combination, appropriate solution examples, the highest-performing LLMs, and a suitable environment updating approach. The results indicate that LLMs achieve high success rates in generating correct code upon initial output, and can achieve perfect results after the evaluation and feedback of humans. In addition, it also has perfect accuracy in generating environment descriptions. Therefore, it is feasible to use LLMs as the understanding and reasoning unit of HRC systems.

In Table 4 , using GPT-3.5 as the baseline, this paper analyzes the impact of different prompting strategies on performance. The results show that CoT, Few-shot, and SoT all contribute to improving LLMs’ performance, and their combined use can further enhance it. In the CoT-1shot setting, LLMs are required to generate code directly without any comment, whereas in the CoT-1shot-SoT setting, LLMs are encouraged to generate comments. In Table 5 , the influence of the number of solution examples on the results is analyzed. While more examples are generally beneficial, the goal is to accomplish the task with as few examples as possible. Furthermore, it can be observed that the number of examples is particularly effective in solving difficult tasks. Table 6 evaluates the impact of LLMs with different performance levels and of human feedback on HRC code generation. All models use the CoT-1Shot-SoT prompt strategy, and GPT-4 exhibits the best performance. Table 7 discusses the effects of the union prompt and decomposed prompt methods on environment updating, using the GPT-3.5 model and the CoT-3Shot-SoT strategy. The union prompt method outperforms the decomposed method, possibly because the union prompt can leverage more background, including human instructions, the thought process, etc. This also reflects that natural language is easier for LLMs to understand than code.

Based on experimental cases, there are some additional insights worth discussing to provide references for designing more suitable HRC prompts. In summary, the paper found that the following factors also influence the performance of LLMs in code generation and are worth further exploration: (1) API definition: Clear definitions of APIs are important. Not only their functions but also details such as parameters and return value types are required for LLMs. Avoid overloaded APIs, which may result in LLMs easily making errors. (2) Detailed instructions: Task descriptions filled with details generally yield better results. Providing LLMs with key steps in tasks is particularly effective for weaker LLMs, such as GPT-3.5. (3) Historical conversation: Whether historical conversation helps answer new questions is still debated. In some cases, such as when GPT-3.5 frequently misunderstands 6D poses as 3D coordinates, the history of human feedback on this error can help LLMs avoid similar mistakes in subsequent outputs. However, for complex tasks, multiple rounds of human feedback are required to correct mistakes, and the introduction of errors may have adverse effects on new tasks. (4) Crashes on simple tasks: On simple tasks, more examples sometimes decrease performance, because additional examples introduce bias, while fewer examples prompt LLMs to generate code directly without excessive deliberation.

Evaluation of semantic segmentation

This paper creates the HRCA-Obj dataset, comprising segmented images of objects, tools, and the warehouse, totaling seven categories, as shown in Fig. 7 . A part of this dataset serves as the forward input for VFMs, extracting view-independent features as labels, with approximately 600 images per category. The other part is used to test algorithm performance, with around 40 images per category, where the proportions of occluded and unoccluded data are roughly equal. The paper developed a semi-automatic annotation tool for annotating a large number of images. Specifically, it first records videos of objects surrounded by a green screen. Then, 600 frames are extracted for each category at regular intervals. Background removal is achieved using a combination of OpenCV and the u2net 55 algorithm. Finally, the segmented images are manually evaluated, and SAM is used for manual segmentation when the automatic results are poor.

Figure 7. Examples of HRCA-Obj dataset.

This paper only evaluates the classification accuracy of segmented images and not the accuracy of image segmentation, due to an off-the-shelf SAM model being utilized. Classification using raw features extracted by CLIP is the baseline method. Further, this paper uses randomly generated basis transformation matrices to map the original features, as a control group for mapping using PCA. Lastly, PCA was used to map original features to different dimensions for evaluation.

The paper introduces an evaluation metric called Feature Consistency Ratio (FCR), representing the ratio of feature differences among different objects, and that among distinct views of the same object. A higher \(FCR(I_c)\) indicates greater feature differences among different categories. The formulas for FCR are given in Eq. ( 1 ) and Eq. ( 2 ), where \(C=\{1,...,c,k\}\) is the set of categories, \(I_c\) is the set of test images for category c , \(f_k\) is the feature label for category k , \(E_I(\cdot )\) is CLIP’s image encoder, and \(\left\langle \cdot ,\cdot \right\rangle\) denotes cosine similarity between features.
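The paper's exact Eq. (1) and Eq. (2) are not reproduced here, so the sketch below is only a speculative reading of FCR based on the qualitative description above: the mean feature distance to other categories' labels divided by the mean distance among views of the same category.

```python
# Speculative numpy sketch of FCR; this assumes (not confirms) that FCR is the mean
# cross-category feature distance divided by the mean within-category cross-view distance.

import numpy as np

def fcr(view_features: np.ndarray, own_label: np.ndarray, other_labels: np.ndarray) -> float:
    """view_features: (n_views, d) normalized features of one category's test images.
    own_label: (d,) that category's feature label; other_labels: (m, d) other labels."""
    # distance measured as 1 - cosine similarity on normalized features
    within = 1.0 - view_features @ own_label          # per-view distance to own label
    across = 1.0 - view_features @ other_labels.T     # per-view distances to other labels
    return float(across.mean() / (within.mean() + 1e-8))
```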

Table 8 reports the results of various methods on the HRCA-Obj dataset. Baseline exhibits higher classification accuracy than using random dimension mapping, i.e., Random(256) . This aligns with expectations, as random dimension mapping alters the spatial dimensions of initial features, sometimes weakening the FCR and making classification more challenging. In comparison to the baseline method, PCA(-256) demonstrates improved classification accuracy, with an average accuracy exceeding the baseline by 4.7%. Here, (-256) indicates mapping the original features to the first 256 dimensions with smaller variations. The results suggest that separately mapping original features to view-independent dimensions for different classes can reduce feature differences among distinct views of the same object, thereby avoiding misclassification.

Furthermore, this paper evaluates the classification results of mapping original features to different dimensions, including dimension quantity and direction. PCA(512) maps the original features to the first 512 dimensions with significant variations, which should increase differences between views and decrease classification accuracy. However, possibly because the original feature dimension is large enough, PCA(512) maintains the same accuracy as the baseline method. It can be observed that the average FCR of PCA(512) and the baseline method are consistent, as shown in Table 8 . Furthermore, it can be found that the performance of PCA(256) is not as good as the baseline method due to a weakened FCR. PCA(-128) exhibits a performance decrease compared to PCA(-256) because such small dimensions are insufficient to represent object features, but it still outperforms the baseline method. Table 8 also reports the FCR of the HRCA-Obj dataset, showing a decreasing trend as the dimensions are mapped in the direction of smaller variations. Comparing Acc. and FCR in the table, the classification accuracy and FCR show a positive correlation. This quantitative evidence suggests that mapping features to dimensions with smaller view differences can enhance classification accuracy.

Validation of HRC assembly task

Environment and task setup

Figure 8 introduces the experimental environment, consisting of four workspaces: assembly space for collaborative assembly by humans and robots, tool space for storing tools, part space for storing parts to be installed, deliver space for handover between humans and robots, and a set of part boxes to store fasteners. The hardware includes a FR3 robot, a robot controller, a Realsense D435i camera at the robot’s end-effector for scene perception and visual grasping; a HoloLens glass or a screen for receiving and displaying code generated by LLMs; a Bluetooth microphone for receiving speech. An upper computer communicates with the server and robot for transmission and control. A GPU server runs deep learning algorithms.

Figure 8. HRC system setup for parts assembly.

This paper designs an assembly task of a satellite component model, which is a simplified model of the real product, to verify the proposed HRC system. The assembly task advocates a human-centered assembly mode, which follows human instructions without a specific assembly sequence. The satellite component model is composed of 1 battery, 4 stringers, 1 framework, and 1 signalboard. Fasteners include 6 slotted screws and 12 hex screws, as shown on the right side of Fig. 9 , with abbreviations and quantities. It also introduces a common assembly process, but it is not the only installation sequence. Firstly, install two stringers on one side of the framework; move the battery to the center of the framework. Secondly, connect the battery to the initial two stringers, and install the last two stringers on the other side of the framework. Thirdly, connect the newly installed stringer with the battery. Fourthly, install the signalboard on the side of the framework and connect it to the battery and framework through slotted screws. Finally, the assembly of the satellite component model is finished.

Figure 9: The assembly process and part relationships of the satellite component model.

Validation of HRC assembly

The feasibility and effectiveness of the FMs-based HRC system are verified through the HRC part assembly case of the satellite component model. Following the experimental process, the paper reports: (1) inferring robot control code from human instructions using LLMs, as depicted in Fig. 10; (2) recognizing human finger pointing using VFMs, as described in Fig. 11; (3) grasping parts and tools using VFMs, as shown in Figs. 12 and 13; (4) the workflow of the HRC system through a part assembly case, as illustrated in Fig. 14; and (5) a comparison with related methods, as shown in Table 9. The following sections explain each function in detail.

Figure 10: HRC code reasoning based on LLMs.

HRC code reasoning

Figure 10 visualizes the HRC code generated by LLMs from the HRC prompt (introduced in LLMs for task reasoning in HRC) for two typical collaboration tasks. The gray region represents the input, containing the environment description and the human instruction. The blue region shows the inferred collaborative code, while the yellow region describes the updated environment description. Specifically, the left side of Fig. 10 depicts the robot grasping a hex screwdriver and handing it over to the human when the human picks up a hex screw and asks for help. The right side illustrates the robot detecting the assembly location pointed to by the human finger, then grasping the stringer and moving it to that location, following the human's short assembly instruction.
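The reasoning loop behind Fig. 10 can be pictured with the sketch below. It is only an illustration of the pattern, not the paper's exact prompt or pipeline: the llm callable stands for any chat-style model interface, and extract_python_code is a hypothetical helper that strips the code from the model's reply.

# Minimal sketch of LLM-based HRC code reasoning (illustrative only).
from typing import Callable

def extract_python_code(text: str) -> str:
    """Hypothetical helper: keep only the fenced block if the model wraps
    its answer in triple backticks."""
    if "```" in text:
        return text.split("```")[1].removeprefix("python").lstrip("\n")
    return text

def reason_hrc_code(llm: Callable[[str], str],
                    env_description: str,
                    instruction: str) -> str:
    """Assemble the prompt from the scene and the human instruction,
    then return the robot-control code produced by the LLM."""
    prompt = (
        "# Environment description\n"
        f"{env_description}\n"
        "# Human instruction\n"
        f"{instruction}\n"
        "# Write Python code using the robot APIs to complete the task.\n"
    )
    return extract_python_code(llm(prompt))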

Figure 11: Assembly location recognition based on VFMs.

Assembly location recognition

Figure 11 reports an example of detecting the assembly location pointed to by the human using the VFMs-based semantic segmentation method (introduced in VFMs for semantic perception in HRC). The inspiration for this approach comes from a common problem in assembly: for free-form tasks with no fixed order and many identical parts, it is difficult for humans to express an assembly location through language alone, and asking humans to remember descriptions of all assembly positions is impractical, especially when facing a large number of parts. Recognizing locations through the interaction between the human finger and a reference object, by contrast, is intuitive and meets assembly requirements, so humans can easily convey their assembly intention. First, a reference object is required, typically a fixture; in this experiment it is the framework. Let the tuple \(\{ X_{loc}^{2d}(i),X_{loc}^{6d}(i) \}\) represent the pixel coordinates and spatial 6D coordinates of assembly position i. By finding the assembly position of the reference object closest to the human finger in pixel space, \(i^{*}=\mathop {\arg \min }_{i\in I} \Vert X_{ptd} - X_{loc}^{2d}(i) \Vert\), the assembly location \(X_{loc}^{6d}(i^{*})\) is determined, where \(X_{ptd}\) denotes the pixel coordinate of the fingertip in the segmentation map and I is the set of assembly positions. In the experiments, this process is implemented through the API get_pointed_assembly_location(scene_description). In addition, this paper also explores determining the location using LLMs instead of the API, which gives reasonable HRC code in most cases.
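A minimal sketch of this matching step is shown below; it mirrors the arg-min formula above but assumes a hypothetical candidates structure that stores, for each assembly position, its pixel coordinate \(X_{loc}^{2d}(i)\) and its 6D pose \(X_{loc}^{6d}(i)\).

# Minimal sketch of get_pointed_assembly_location-style matching (illustrative).
# Assumption: `fingertip_px` is the fingertip pixel X_ptd from the segmentation map,
# and `candidates` maps each assembly position i to its pixel coordinate and 6D pose.
import numpy as np

def pointed_assembly_location(fingertip_px, candidates):
    """Return the 6D pose of the position whose pixel coordinate is closest
    to the fingertip pixel, i.e. the arg-min in the formula above."""
    best_i = min(
        candidates,
        key=lambda i: np.linalg.norm(np.asarray(fingertip_px, float)
                                     - np.asarray(candidates[i]["pixel_2d"], float)),
    )
    return candidates[best_i]["pose_6d"]

# Usage (illustrative values):
# candidates = {0: {"pixel_2d": (410, 220), "pose_6d": (0.42, 0.10, 0.05, 0, 0, 0)},
#               1: {"pixel_2d": (385, 305), "pose_6d": (0.42, 0.18, 0.05, 0, 0, 0)}}
# pose = pointed_assembly_location((400, 230), candidates)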

Figure 12: VFMs for part recognition and grasping.

Robot visual grasping

The VFMs-based semantic segmentation method also benefits robot visual grasping, as shown in Figs. 12 and 13. This paper adopts 2D planar grasping with the following process. First, obtain the semantic segmentation of the images captured by the RealSense camera. Then, compute the grasp pose, comprising a 3D location and a 1D angle, from the point cloud and the 2D segmented region. Next, transform the grasp pose from the camera frame to the robot frame using the matrix obtained from hand-eye calibration. Finally, control the robot to execute the grasping action.
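As a hedged illustration of the camera-to-robot frame change, the snippet below applies an assumed 4x4 hand-eye calibration matrix T_base_cam (camera frame to robot base frame) to a grasp location; the 1D angle would be rotated by the corresponding rotation in the same way, which is omitted here for brevity.

# Minimal sketch: express a grasp location in the robot base frame (illustrative).
# Assumption: T_base_cam is the 4x4 homogeneous matrix from hand-eye calibration.
import numpy as np

def camera_to_robot(point_cam_xyz, T_base_cam):
    """Apply the homogeneous transform to a 3D grasp location."""
    p = np.append(np.asarray(point_cam_xyz, dtype=float), 1.0)  # to homogeneous
    return (T_base_cam @ p)[:3]

# Usage (illustrative): grasp_base = camera_to_robot((0.02, -0.05, 0.43), T_base_cam)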

Figure 12 illustrates the process of part grasping. This paper uses OpenCV to obtain the minimum bounding rectangle of the object, and the angle perpendicular to the rectangle's main axis is taken as the 1D grasp angle. For simplicity in the experiment, the signal interface board serves as the reference, and all parts use the same 1D angle, as shown in Fig. 12b. The 3D location is obtained by averaging the surface point cloud of the corresponding part, as shown in the yellow area in Fig. 12c. Due to calibration errors, the RGB and depth images cannot be aligned perfectly, so some points that do not belong to the object are introduced by mistake and bias the computed location. By filtering the point cloud along the depth (z-axis), these points can be excluded, as shown in the blue area in Fig. 12c. Finally, combining the 3D location with the 1D angle, the robot can grasp the parts.
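The computation just described might be sketched as follows; the mask and point-cloud layout, the median-based depth filter, and the angle convention are assumptions for illustration rather than the authors' exact code.

# Minimal sketch of the part-grasp computation (illustrative, not the authors' code).
# Assumptions: `mask` is the (H, W) binary segmentation of one part and `points` is
# the organized point cloud aligned to the image, shape (H, W, 3), in metres.
import cv2
import numpy as np

def part_grasp(mask: np.ndarray, points: np.ndarray, z_tol: float = 0.01):
    """Return (xyz_cam, angle_deg): the mean of the depth-filtered surface points
    and the 1D angle perpendicular to the bounding rectangle's main axis."""
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    rect = cv2.minAreaRect(max(contours, key=cv2.contourArea))
    angle_deg = rect[2] + 90.0              # perpendicular to the main axis
                                            # (convention depends on OpenCV version)
    obj_pts = points[mask > 0]              # 3D points inside the mask
    z_ref = np.median(obj_pts[:, 2])        # robust estimate of the surface depth
    keep = np.abs(obj_pts[:, 2] - z_ref) < z_tol   # drop misaligned background points
    xyz_cam = obj_pts[keep].mean(axis=0)
    return xyz_cam, angle_deg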

Figure 13 shows the process of grasping tools, which is similar to part grasping; the only difference is how the 3D location is acquired. Most tools are low-profile objects, so perspective distortion is negligible. It is therefore sufficient to obtain the 2D pixel grasp center from the RGB image alone, without the point cloud; this center lies at a fixed proportional offset from the center of the bounding rectangle toward the handle. The 3D location is then obtained from this pixel grasp center. The 1D angle is again obtained with OpenCV, perpendicular to the main axis of the tool's bounding rectangle.
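A rough sketch of the tool-grasp computation is given below; the proportional offset, the handle direction, and the fixed depth are illustrative assumptions, and the back-projection uses the standard pinhole model with camera intrinsics fx, fy, cx, cy.

# Minimal sketch of the tool-grasp center computation (illustrative).
# Assumptions: `rect_center`, `handle_dir` and `rect_length` come from the tool's
# minimum bounding rectangle; `offset_ratio` is the fixed proportional offset toward
# the handle; fx, fy, cx, cy are camera intrinsics and `depth` is a fixed table depth.
import numpy as np

def tool_grasp_3d(rect_center, handle_dir, rect_length,
                  offset_ratio, depth, fx, fy, cx, cy):
    """Shift the rectangle center toward the handle, then back-project the
    resulting pixel to a 3D point with the pinhole camera model."""
    direction = np.asarray(handle_dir, float)
    direction = direction / np.linalg.norm(direction)
    u, v = np.asarray(rect_center, float) + offset_ratio * rect_length * direction
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])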

Figure 13: VFMs for tool recognition and grasping.

Part assembly case of HRC

Figure 14 illustrates the assembly process of a stringer involving three HRC interactions. The sequence of events is as follows. (1) The robot is idle, and the human instructs the FMs-based HRC system through language: "OK robot, please first get the assembly location where my finger is pointing and then move the stringer to that location.", where "OK robot" is the wake phrase for speech recognition. Task-1 starts. (2) The code inferred by the LLMs and approved by the human is sent to the robot for execution. The human points to the desired assembly location, and the robot executes the code to obtain the location pointed to by the human. Task-1 is in progress. (3) While the robot is still executing the code that moves the stringer to the assembly position, the human picks up a hex screw and tells the system: "OK robot, I pick up hex screw, pass me the appropriate tool." Task-1 is in progress; Task-2 starts and blocks, waiting for the completion of Task-1. (4) The robot finishes the code and the stringer is now at its assembly location. Task-1 is completed; Task-2 is in progress. (5) The robot executes the code that delivers the hex screwdriver to the human and waits for subsequent instructions. Task-1 and Task-2 are completed. (6) The human picks up the screwdriver and tells the system: "OK robot, please open your gripper." The system produces the code and controls the robot to release the gripper. Task-1 and Task-2 are completed; Task-3 runs from start to completion. (7) The human assembles the stringer independently. (8) The human and robot have completed the assembly of the stringer.
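The blocking behavior between Task-1 and Task-2 can be pictured as a simple first-in-first-out execution queue: a new instruction is parsed and enqueued as soon as it arrives, but its code runs only after the previous task finishes. The sketch below is an assumed illustration of such sequencing, not the paper's actual scheduler.

# Minimal sketch of FIFO task sequencing (illustrative, not the paper's scheduler).
import queue
import threading

task_queue: "queue.Queue" = queue.Queue()

def worker():
    while True:
        task = task_queue.get()      # Task-2 waits here until Task-1 has finished
        try:
            task()                   # run the LLM-generated code for this task
        finally:
            task_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

# Usage: enqueue tasks as instructions arrive; they execute strictly in order.
# task_queue.put(lambda: robot.move_stringer_to(pointed_location))   # Task-1
# task_queue.put(lambda: robot.handover("hex screwdriver"))          # Task-2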

Figure 14: FMs-based HRC system for part assembly.

Performance of HRC case

This paper selects key tasks in the collaborative assembly to evaluate the performance of the HRC system: a total of 17 tasks drawn from the assembly experiments, e.g., "please grasp the signal interface board". They are divided into four categories, namely STPO, UTPO, STNP, and UTNP, according to whether the task has been seen and whether it involves visual perception. Specifically, Seen Task (ST) means the task appears in the prompt's examples or has predefined code, and Plan Only (PO) means completing the task requires only planning, without perception. Since the successful execution of a task involves two modules, planning and perception, this paper discusses the performance of both.

This paper compares the Baseline and two different versions of CaP 17 . The Baseline uses only pre-programmed methods without any FM, which is also the mainstream approach in current HRC systems. CaP provides multiple versions, two of which are chosen for the experiment: one without a vision module, and one using ViLD 23 as the vision module, where this paper uses a descriptive characteristic of the object as the text label for the VLM instead of its technical name, for example, a red screwdriver instead of a slotted screwdriver, and a white block instead of a battery pack. The results are shown in Table 9 and indicate that the traditional HRC method can only handle a limited number of tasks and cannot solve unseen tasks, whereas the LLM-enhanced methods increase the generalization ability of the system and can reason about undefined tasks. Our method provides more generalized task reasoning and higher-precision visual perception by introducing LLMs and VFMs.

Existing challenges and directions for improvement

Potential failure modes.

The failures of the system during operation fall mainly into three categories. (1) Speech recognition failure: the speech recognition model fails to correctly recognize the human's language. This type of failure can be resolved by restating the instruction. (2) Code reasoning failure: even after multiple interactions between the human and the LLMs, the LLMs still cannot infer the correct code, which may be caused by tasks that are too challenging. This needs to be addressed in the future by improving the performance of LLMs or VFMs. (3) Program execution failure: during execution, the camera misidentifies an object or obtains incorrect depth information due to lighting and other factors, resulting in grasping failure. At present, the only way to avoid potential collision risks is the emergency stop; active obstacle avoidance and automatic emergency stopping will be considered in the future.

This paper qualitatively evaluates the impact of lighting, object position, and occlusion on system stability. Lighting mainly affects depth detection and thereby the program's execution results. Under frontal lighting, i.e., when the light source is aligned with the camera's viewing direction, the depth image contains noise. Under side lighting, i.e., when the light direction forms an angle with the camera's viewing direction, there is noise as well as local loss of depth data. When the object is reflective, such as a metal surface, local loss of depth data is common. In our experiments, however, these phenomena were not significant, and using the average depth value of the object's surface area avoids the loss of local depth data. The position and occlusion of the object affect recognition, but the method described in VFMs for semantic perception in HRC supports recognition under different views and occlusion conditions, as shown in Fig. 15.

In addition, the impact of environmental noise on the HRC system deserves consideration, particularly since the proposed system relies on speech interaction. Electrical and equipment noise are the primary sources; robot movements also generate noise, especially during the start and stop phases, while the video equipment produces almost no sound. Speech recognition consists of two phases, detection and recognition, and noise primarily affects the detection phase. In our experiments, noisy environments more frequently resulted in unrecognized wake words than quiet settings. This issue can be mitigated through fuzzy string matching of the wake words and a higher speaking volume from the operator, so the impact is not significant, and the accuracy of speech recognition itself is almost unaffected by noise. Notably, undetected wake words introduce uncertainty for operators, highlighting an area for further research to improve the user experience.
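The fuzzy wake-word matching mentioned above could, for example, be realized with a similarity ratio over a sliding window of transcribed words; the sketch below uses Python's difflib and an assumed threshold, and is only one plausible way to implement the idea.

# Minimal sketch of fuzzy wake-word matching (illustrative).
# Assumption: `transcript` is the raw speech-recognition output; the wake phrase
# is "OK robot" and the similarity threshold tolerates noisy transcriptions.
from difflib import SequenceMatcher

def contains_wake_word(transcript: str,
                       wake: str = "ok robot",
                       threshold: float = 0.8) -> bool:
    words = transcript.lower().split()
    n = len(wake.split())
    for i in range(len(words) - n + 1):
        window = " ".join(words[i:i + n])
        if SequenceMatcher(None, window, wake).ratio() >= threshold:
            return True
    return False

# contains_wake_word("okay robot please open your gripper")  # likely True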

Figure 15: Recognition under occlusion and different views.

Extension to more complex tasks and scenes

Users can expand the system to support additional tasks, parts, and tool sets following the details in the "Methods" section. Specifically: (1) Task reasoning: users reuse the existing prompt and the robot's low-level APIs while providing new configuration files that record key locations for the new tasks; task-related APIs that handle complex tasks which cannot be completed with low-level APIs alone; and new scene descriptions and solution examples that help the LLMs comprehend the new tasks and scenes and generalize to other unseen tasks (a sketch of this pattern follows this paragraph). (2) Visual perception: users collect data for the new parts or tools, update the original dataset, and then perform visual recognition without additional training.
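A hedged sketch of this extension pattern is shown below; every name in it (key_locations, TASK_APIS, handover_phillips_screwdriver, the robot methods) is a hypothetical illustration of the configuration-plus-API idea, not the repository's actual interface.

# Minimal sketch of extending the system with new key locations and a task-related API.
# All names here are hypothetical illustrations, not the repository's actual interface.
key_locations = {
    "part_space/phillips_screwdriver": (0.52, -0.18, 0.03, 0.0, 3.14, 0.0),
    "deliver_space/handover_pose":     (0.35,  0.00, 0.25, 0.0, 3.14, 0.0),
}

TASK_APIS = {}

def task_api(name):
    """Register a task-related API so the prompt can expose it to the LLM by name."""
    def decorator(fn):
        TASK_APIS[name] = fn
        return fn
    return decorator

@task_api("handover_phillips_screwdriver")
def handover_phillips_screwdriver(robot):
    """Composite task that cannot be completed with low-level APIs alone."""
    robot.grasp_at(key_locations["part_space/phillips_screwdriver"])
    robot.move_to(key_locations["deliver_space/handover_pose"])
    robot.open_gripper()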

The purpose of this paper is to explore the application of FMs to HRC assembly in industry. At present, there is still a trade-off between the generalization of FMs and the complexity of the tasks they support. LLMs can generalize between relatively simple or similar robot tasks, but they are not yet competent for complex tasks that involve large amounts of code and fine operations, such as positioning and screwing, where fine positioning and complex control are necessary. VFMs are trained on everyday data, so a domain gap exists. Although this paper uses metric learning and PCA to recognize industrial parts under different views and occlusion, the recognition accuracy still needs to be improved, especially for similar objects. Figure 16 shows some success and failure cases. When a Phillips screwdriver was added to the tool set and the dataset was expanded, the HRC system could recognize and grasp it; however, misidentification is prone to occur if new objects are added without expanding the dataset.

Figure 16: Success and failure cases for extended tool sets.

User research was conducted to evaluate the HRC system more comprehensively from the user's perspective, including a questionnaire and focus groups, in order to provide directions for future improvement. For the questionnaire, this paper surveyed 8 users, all of whom were technicians or people with relevant technical backgrounds. After being introduced to the aim and functions of the HRC system and shown the demonstration videos, the users evaluated the existing system along various dimensions inspired by the Human–Robot Collaboration Quality Index (HRC-QI) 56 , such as Interaction ability (I) and User eXperience (UX). The results are shown in Table 10: the Overall Score (OS) of our HRC system is 3.563/5, indicating a moderate to high level of HRC performance. Moreover, focus groups were organized with the same respondents to analyze the limitations of the existing methods more specifically and to propose improvements for further research, mainly including: (1) Interaction: AR glasses are not a convenient way to review and approve the generated code; in fact, the AR glasses sometimes had communication problems with the computer, so a screen was added in front of the platform. In the future, a better interaction experience can be provided through a screen located on the desktop. In addition, because wake words are occasionally not detected, operators may experience significant cognitive strain, which calls for improved interaction methods. (2) Response efficiency: the response time for code generation is long. Users suggested that generated code be stored for subsequent calls of the same task, or that LLMs with faster response speed be selected instead. (3) Safety: although the robot runs at a low speed, there is currently only one safety measure, i.e., the emergency stop; collision detection and other methods should be introduced in the future.

Comparison with different HRC systems

In addition to the FMs-based HRC system proposed in this paper, which incorporates LLMs and VFMs, other HRC systems also enhance interaction and collaboration between humans and robots through approaches such as Action Recognition (AR) 57 , Speech Recognition (SR) 58 , Extended Reality (XR) 59 , and Touch Panels (TP) 60 . Consequently, we compared the characteristics of these technologies from different perspectives, as shown in Table 11. Specifically, action recognition enables flexible and intuitive engagement; XR offers detailed and immersive interactions; speech recognition facilitates clear and accurate communication; and touch panels provide detailed information to ensure safe interactions. Like these methods, the proposed HRC system focuses on enhancing the user experience and completing tasks effectively through collaboration. The distinction is that the other methods primarily target specific tasks and scenarios, whereas the proposed approach emphasizes task comprehension, reasoning, and generalization to support diverse natural language instructions and unseen tasks. In the future, integrating FMs with these technologies could further promote more generalized, flexible, reliable, and user-friendly human-robot collaboration.

To enhance the generalization of HRC systems, this paper proposes an FMs-based HRC system that adapts easily to new scenes and supports reasoning for unseen tasks. The paper introduces the design of the system and analyzes the roles of LLMs and VFMs. LLMs are adapted to HRC tasks through prompt engineering, supporting autonomous reasoning for undefined tasks. A VFMs-based semantic perception method is proposed to recognize unseen objects without training, and its performance is enhanced by PCA. As a result, the FMs-based HRC system can generate HRC code and control the robot for collaboration based on human instructions. Extensive experiments demonstrate the performance of LLMs on HRC task reasoning and of VFMs on scene semantic perception. Finally, the feasibility and effectiveness of the system are validated through the HRC part assembly case of the satellite component model.

Data availability

The datasets generated and/or analysed during the current study are available in the GitHub repository: https://github.com/yuchen-ji/assemblyhelper .

Boden, M. A. Artificial Intelligence (Elsevier, 1996).

Sisinni, E., Saifullah, A., Han, S., Jennehag, U. & Gidlund, M. Industrial internet of things: Challenges, opportunities, and directions. IEEE Trans. Ind. Inf. 14 , 4724–4734 (2018).

Leng, J. et al. Digital twins-based smart manufacturing system design in industry 4.0: A review. J. Manuf. Syst. 60 , 119–137 (2021).

Ajoudani, A. et al. Progress and prospects of the human-robot collaboration. Auton. Robot. 42 , 957–975 (2018).

Leng, J. et al. Industry 5.0: Prospect and retrospect. J. Manuf. Syst. 65 , 279–295 (2022).

Zhou, J., Zhou, Y., Wang, B. & Zang, J. Human-cyber-physical systems (hcpss) in the context of new-generation intelligent manufacturing. Engineering 5 , 624–636 (2019).

Shridhar, M., Manuelli, L. & Fox, D. Cliport: What and where pathways for robotic manipulation. In Conference on Robot Learning 894–906 (PMLR, 2021).

Radford, A. et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning 8748–8763 (PMLR, 2021).

Jiang, Y. et al. Vima: Robot manipulation with multimodal prompts. In International Conference on Learning Representations (2023).

Brohan, A. et al. Do as I can, not as I say: Grounding language in robotic affordances. In Conference on Robot Learning 287–318 (PMLR, 2022).

Huang, W. et al. Grounded decoding: Guiding text generation with grounded models for robot control. Preprint at http://arxiv.org/abs/2303.00855 (2023).

Lin, K., Agia, C., Migimatsu, T., Pavone, M. & Bohg, J. Text2motion: From natural language instructions to feasible plans. Preprint at http://arxiv.org/abs/2303.12153 (2023).

Huang, W. et al. Voxposer: Composable 3d value maps for robotic manipulation with language models. Preprint at http://arxiv.org/abs/2307.05973 (2023).

Driess, D. et al. Palm-e: An embodied multimodal language model. Preprint at http://arxiv.org/abs/2303.03378 (2023).

Chowdhery, A. et al. Palm: Scaling language modeling with pathways. J. Mach. Learn. Res. 24 , 1–113 (2023).

Sajjadi, M. S. et al. Object scene representation transformer. Adv. Neural. Inf. Process. Syst. 35 , 9512–9524 (2022).

Liang, J. et al. Code as policies: Language model programs for embodied control. In 2023 IEEE International Conference on Robotics and Automation (ICRA) 9493–9500 (IEEE, 2023).

Sun, C., Shrivastava, A., Singh, S. & Gupta, A. Revisiting unreasonable effectiveness of data in deep learning era. In Proceedings of the IEEE International Conference on Computer Vision 843–852 (2017).

Deng, J. et al. Imagenet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition 248–255 (IEEE, 2009).

Zhu, Y. et al. Aligning books and movies: Towards story-like visual explanations by watching movies and reading books. In Proceedings of the IEEE International Conference on Computer Vision 19–27 (2015).

Chen, X., Fan, H., Girshick, R. & He, K. Improved baselines with momentum contrastive learning. Preprint at http://arxiv.org/abs/2003.04297 (2020).

Dosovitskiy, A. et al. An image is worth 16x16 words: Transformers for image recognition at scale. Preprint at http://arxiv.org/abs/2010.11929 (2020).

Gu, X., Lin, T.-Y., Kuo, W. & Cui, Y. Open-vocabulary object detection via vision and language knowledge distillation. Preprint at http://arxiv.org/abs/2104.13921 (2021).

Li, L. H. et al. Grounded language-image pre-training. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 10965–10975 (IEEE, 2022).

He, K. et al. Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 16000–16009 (IEEE, 2022).

Kirillov, A. et al. Segment anything. Preprint at http://arxiv.org/abs/2304.02643 (2023).

Zou, X. et al. Segment everything everywhere all at once. Preprint at http://arxiv.org/abs/2304.06718 (2023).

Ma, J. et al. Segment anything in medical images. Nat. Commun. 15 , 654 (2024).

Yang, J. et al. Track anything: Segment anything meets videos. Preprint at http://arxiv.org/abs/2304.11968 (2023).

Chen, K. et al. Rsprompter: Learning to prompt for remote sensing instance segmentation based on visual foundation model. Preprint at http://arxiv.org/abs/2306.16269 (2023).

Wang, T. et al. Caption anything: Interactive image description with diverse multimodal controls. Preprint at http://arxiv.org/abs/2305.02677 (2023).

Antol, S. et al. Vqa: Visual question answering. In Proceedings of the IEEE International Conference on Computer Vision 2425–2433 (2015).

Li, S. et al. Proactive human-robot collaboration: Mutual-cognitive, predictable, and self-organizing perspectives. Robot. Comput.-Integr. Manuf. 81 , 102510. https://doi.org/10.1016/j.rcim.2022.102510 (2023).

Fan, J., Zheng, P. & Li, S. Vision-based holistic scene understanding towards proactive human–robot collaboration. Robot. Comput.-Integr. Manuf. 75 , 102304 (2022).

Wang, P., Liu, H., Wang, L. & Gao, R. X. Deep learning-based human motion recognition for predictive context-aware human–robot collaboration. CIRP Ann. 67 , 17–20 (2018).

Krizhevsky, A., Sutskever, I. & Hinton, G. E. Imagenet classification with deep convolutional neural networks. Adv. Neural. Inf. Process. Syst. 25 , 84–90 (2012).

Sabater, A., Alonso, I., Montesano, L. & Murillo, A. C. Domain and view-point agnostic hand action recognition. IEEE Robot. Autom. Lett. 6 , 7823–7830 (2021).

Dreher, C. R., Wächter, M. & Asfour, T. Learning object-action relations from bimanual human demonstration using graph networks. IEEE Robot. Autom. Lett. 5 , 187–194 (2019).

Ramirez-Amaro, K., Beetz, M. & Cheng, G. Automatic segmentation and recognition of human activities from observation based on semantic reasoning. In 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems 5043–5048 (IEEE, 2014).

Merlo, E., Lagomarsino, M., Lamon, E. & Ajoudani, A. Automatic interaction and activity recognition from videos of human manual demonstrations with application to anomaly detection. In 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) 1188–1195 (IEEE, 2023).

Zheng, P., Li, S., Xia, L., Wang, L. & Nassehi, A. A visual reasoning-based approach for mutual-cognitive human–robot collaboration. CIRP Ann. 71 , 377–380 (2022).

Diehl, M., Paxton, C. & Ramirez-Amaro, K. Automated generation of robotic planning domains from observations. In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 6732–6738 (IEEE, 2021).

Shirai, K. et al. Vision-language interpreter for robot task planning. In 2024 IEEE International Conference on Robotics and Automation (ICRA) 2051–2058 (IEEE, 2024).

Lee, M.-L., Behdad, S., Liang, X. & Zheng, M. Task allocation and planning for product disassembly with human–robot collaboration. Robot. Comput.-Integr. Manuf. 76 , 102306 (2022).

Yu, T., Huang, J. & Chang, Q. Optimizing task scheduling in human–robot collaboration with deep multi-agent reinforcement learning. J. Manuf. Syst. 60 , 487–499 (2021).

Billard, A., Calinon, S., Dillmann, R. & Schaal, S. Survey: Robot Programming by Demonstration 1371–1394 (Springer, 2008).

Lagomarsino, M. et al. Maximising coefficiency of human–robot handovers through reinforcement learning. IEEE Robot. Autom. Lett. 8 , 4378–4385 (2023).

Lagomarsino, M., Lorenzini, M., De Momi, E. & Ajoudani, A. Robot trajectory adaptation to optimise the trade-off between human cognitive ergonomics and workplace productivity in collaborative tasks. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 663–669 (IEEE, 2022).

Ghadirzadeh, A. et al. Human-centered collaborative robots with deep reinforcement learning. IEEE Robot. Autom. Lett. 6 , 566–571 (2020).

Brohan, A. et al. Rt-1: Robotics transformer for real-world control at scale. Preprint at http://arxiv.org/abs/2212.06817 (2022).

Gu, Y., Han, X., Liu, Z. & Huang, M. Ppt: Pre-trained prompt tuning for few-shot learning. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 8410–8423 (2022).

Zhang, S. et al. Instruction tuning for large language models: A survey. Preprint at http://arxiv.org/abs/2308.10792 (2023).

Wei, J. et al. Chain-of-thought prompting elicits reasoning in large language models. Adv. Neural. Inf. Process. Syst. 35 , 24824–24837 (2022).

Brown, T. et al. Language models are few-shot learners. Adv. Neural. Inf. Process. Syst. 33 , 1877–1901 (2020).

Qin, X. et al. U2-net: Going deeper with nested u-structure for salient object detection. Pattern Recogn. 106 , 107404 (2020).

Kokotinis, G., Michalos, G., Arkouli, Z. & Makris, S. On the quantification of human–robot collaboration quality. Int. J. Comput. Integr. Manuf. 36 , 1431–1448 (2023).

Zhang, Y. et al. Skeleton-rgb integrated highly similar human action prediction in human–robot collaborative assembly. Robot. Comput.-Integr. Manuf. 86 , 102659 (2024).

Gustavsson, P., Syberfeldt, A., Brewster, R. & Wang, L. Human–robot collaboration demonstrator combining speech recognition and haptic control. Procedia CIRP 63 , 396–401 (2017).

Xie, J., Liu, Y., Wang, X., Fang, S. & Liu, S. A new xr-based human-robot collaboration assembly system based on industrial metaverse. J. Manuf. Syst. 74 , 949–964 (2024).

Cao, H.-L. et al. Designing interaction interface for supportive human–robot collaboration: A co-creation study involving factory employees. Comput. Ind. Eng. 192 , 110208 (2024).

Acknowledgements

This is a part of research supported by National Natural Science Foundation of China (52305539), the Natural Science Foundation of Jiangsu Province (BK20230880), the China Postdoctoral Science Foundation (2023M731661), the Fundamental Research Funds for the Central Universities (NS2024033) and the Aeronautical Science Foundation of China (2023Z074052007).

Author information

Authors and affiliations.

College of Mechanical and Electrical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, People’s Republic of China

Yuchen Ji, Zequn Zhang, Dunbing Tang, Changchun Liu & Zhen Zhao

College of Biophotonics, South China Normal University, Guangzhou, 510631, People’s Republic of China

Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, People’s Republic of China

Contributions

Y.J. methodology, modeling, writing-original draft. Z.Q.Z. supervision, software, data acquisition. D.T. conceptualization, project administration. C.L. modeling, conceptualization, supervision. Y.Z., Z.Z., X.L. writing-review & editing.

Corresponding author

Correspondence to Zequn Zhang .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Consent to publish

The authors affirm that human research participants provided informed consent for publication of the images in Figs. 8 and 14 .

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

About this article

Cite this article.

Ji, Y., Zhang, Z., Tang, D. et al. Foundation models assist in human–robot collaboration assembly. Sci Rep 14 , 24828 (2024). https://doi.org/10.1038/s41598-024-75715-4

Received : 06 February 2024

Accepted : 08 October 2024

Published : 22 October 2024

DOI : https://doi.org/10.1038/s41598-024-75715-4

Keywords

  • Human–robot collaboration
  • Foundation models
  • Large language models
  • Vision foundation models
  • Intelligent manufacture
