The application aims to automate the systematic review process, leveraging natural language processing (NLP), machine learning (ML), and data extraction techniques to streamline research design, literature review, and data synthesis. The web application is designed to assist researchers by automating several steps of the process, including formulating research questions, defining inclusion and exclusion criteria, searching for relevant studies, selecting studies, extracting data, and synthesizing findings. This automation is achieved through the use of NLP and ML algorithms, alongside APIs for accessing scientific literature.
The application is built using Python, leveraging libraries such as nltk for NLP tasks, pdfplumber for extracting text from PDFs, BeautifulSoup for HTML parsing, and spacy for advanced NLP processing It creates the web interface, allowing users to interactively input their research question, refine it, and receive updates on the review process, but only input one question/topic for the whole process to work with.
The application integrates with scientific literature databases through their APIs, such as the Semantic Scholar API, to automate the search and retrieval of relevant studies ML algorithms, including those from the sklearn library, are used for screening studies and synthesizing data The application streamlines the creation of systematic reviews, from formulating research questions to synthesizing findings, ready for publication.
This content block uses natural language processing to identify key terms from the user's research question and formulates inclusion and exclusion criteria for the systematic review. It filters out irrelevant words and focuses on unique keywords, excluding studies that do not meet these criteria and a specified year range.
The "Searching for Relevant Studies" content block uses the Semantic Scholar API to find studies that match the user's query and inclusion/exclusion criteria. It retrieves and logs relevant documents for further review and selection.
This section utilizes pdfplumber and BeautifulSoup to extract text from PDFs and HTML content of selected studies. The spacy NLP library is used to structure the extracted data, which includes details like titles, years, and identified entities.
This is responsible for the qualitative and quantitative synthesis of the extracted data. It uses LDA topic modeling for qualitative synthesis and a random-effects meta-analysis for quantitative synthesis. The function 'synthesize_findings_for_publication' generates a cohesive narrative that includes an introduction, methods, results, discussion, and conclusion sections based on the qualitative and quantitative results.
The "Generating Publication Sections" content block is responsible for creating the different sections of the final publication-ready document. It includes functions to generate the introduction, methods, results, discussion, and conclusion sections, each populated with relevant content based on the systematic review findings.
NextJS is included for type Headless CMS. TailwindCSS and graphql-ts-client are set up within the project.
Rename the .env.local.dist
file to .env.local
The npm run start
command starts Contember Admin and NextJS, which runs on localhost:3000
.
Before each launch, npm run codegen
will run the command to generate Typescript definition for GraphQL in the /website/api/__generated
folder.
api
- folder containsgraphQLExecutor.ts
file, which provides the call to the GraphQL endpoint.__generated
-⚠️ Do not edit this folder⚠️ . This folder contains automatically generated files for the GraphQL client. The structure should be regenerated withnpm run codegen
after changing the model and performing the migration.queries
- if we have identified one of the entities as a content entity during the generation process, you will find a file with a fully compiled query for that entity in this folder.
app
entity_name/[id]/page.tsx
- contains a basic rendering of the page structure based on the generated GraphQL call, this is the file you will want to start editing after publishing first post.
components
[RichTextRenderer.tsx](https://docs.contember.com/reference/admin/api/v1.2/Content%20rendering/RichTextRenderer/)
- component for rendering the Content entity
public
- static assetsscripts