Frequently Asked Questions (FAQ)
General Questions
What is the Literature Review Tool?
The Literature Review Tool is a Python package that automates literature searches across multiple academic databases using the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) methodology. It helps researchers systematically search, filter, and document literature reviews.
Who should use this tool?
This tool is designed for:
- Researchers conducting systematic literature reviews
- Graduate students working on thesis literature reviews
- Academics preparing meta-analyses
- Anyone needing to search multiple databases systematically
Is this tool free to use?
Yes, the tool itself is free and open-source (GPL-3.0 license). However, some databases require API keys:
- Free: PubMed, CrossRef, arXiv, CORE (free API key), SemanticScholar, DBLP
- Paid/Institutional: IEEE, Springer, Scopus
Installation and Setup
What are the system requirements?
- Ubuntu 24.04.2 LTS (or compatible Linux distribution)
- Python 3.9 or higher
- Dependencies: requests, drawpyo, matplotlib, pytest
Can I use this on Windows or macOS?
The tool is developed for Linux (Ubuntu), but should work on:
- macOS: Generally compatible
- Windows: Should work with Python 3.9+, may need adjustments for file paths
How do I install the tool?
Simply run:
pip install literature_search
See the Installation Guide for details.
I’m getting “permission denied” errors during installation. What should I do?
Try installing with user permissions:
pip install --user literature_search
Or use a virtual environment (recommended):
python -m venv venv
source venv/bin/activate
pip install literature_search
Configuration
Do I need to provide keywords?
Keywords are optional. If you don’t provide them, the tool auto-generates keywords from your research_topic field and prints a warning message.
Recommended: Provide your own carefully chosen keywords for best results.
What’s the difference between inclusion and exclusion criteria?
- Inclusion criteria: Publications matching ANY of these will be INCLUDED
- Exclusion criteria: Publications matching ANY of these will be EXCLUDED (even if they match inclusion criteria)
Exclusion criteria take precedence over inclusion criteria.
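The precedence rule can be illustrated with a minimal sketch. It assumes plain case-insensitive substring matching against a title, which may differ from how the tool actually matches; the function name screen and the "unmatched" outcome are hypothetical and only serve the illustration.

```python
def screen(title, inclusion, exclusion):
    """Classify a title; exclusion is checked first so it always wins.

    Assumption: criteria are plain keywords matched as case-insensitive
    substrings. The tool's real matching may be more elaborate.
    """
    text = title.lower()
    # Any single exclusion match removes the publication.
    if any(term.lower() in text for term in exclusion):
        return "excluded"
    # Any single inclusion match keeps the publication.
    if any(term.lower() in text for term in inclusion):
        return "included"
    # Hypothetical label for records that match neither list.
    return "unmatched"


print(screen("Deep learning survey", ["deep learning"], ["survey"]))
print(screen("Deep learning for imaging", ["deep learning"], ["survey"]))
```

The first title matches both lists and is excluded; the second matches only the inclusion list and is kept.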
What is the field:value syntax?
The field:value syntax allows precise filtering:
"inclusion_criteria": [
"type:journal-article",
"language:english",
"journal:nature"
]
This is more precise than simple keywords. See Configuration Guide for details.
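As a rough illustration of why field:value criteria are more precise, the sketch below splits a criterion into a field and a value and checks it against a single record. The helper names parse_criterion and matches, the record layout, and the substring comparison are all assumptions made for this example, not the tool's actual implementation.

```python
def parse_criterion(criterion):
    """Split 'field:value' into (field, value).

    A criterion without a usable 'field:' prefix is treated as a plain
    keyword and returned as (None, keyword).
    """
    field, sep, value = criterion.partition(":")
    if sep and field.strip() and value.strip():
        return field.strip().lower(), value.strip().lower()
    return None, criterion.strip().lower()


def matches(record, criterion):
    """Check one record (a dict of metadata) against one criterion."""
    field, value = parse_criterion(criterion)
    if field is None:
        # Plain keyword: look in every metadata value.
        return any(value in str(v).lower() for v in record.values())
    # field:value: look only in the named field.
    return value in str(record.get(field, "")).lower()


# Hypothetical record layout, for illustration only.
record = {"type": "journal-article", "language": "English", "journal": "Nature Medicine"}
print(matches(record, "journal:nature"))
print(matches(record, "type:book"))
```

A plain keyword such as "nature" could match a title or abstract by accident, while journal:nature is only compared against the journal field.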
Can I use multiple databases at once?
Yes! Just list them in your configuration:
"databases": ["PubMed", "CrossRef", "arXiv", "IEEE", "Springer"]
The tool will search all specified databases sequentially.
API Keys
Which databases require API keys?
- Required: CORE, IEEE, Springer, Scopus
- Not required: PubMed, CrossRef, arXiv, SemanticScholar, DBLP
How do I get API keys?
See the Databases Guide for detailed instructions for each database.
What happens if I don’t provide an API key for a required database?
The tool will skip that database with a warning message. Other databases will still be searched.
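To predict which databases would be skipped before starting a long run, something like the following sketch may help. The set of key-requiring databases comes from the list above; the function name split_searchable and the shape of the api_keys mapping are assumptions for illustration.

```python
# Databases that need an API key, as listed in this FAQ.
REQUIRES_KEY = {"CORE", "IEEE", "Springer", "Scopus"}


def split_searchable(databases, api_keys):
    """Return (searchable, skipped) for the requested databases.

    api_keys is a hypothetical mapping of database name to key string.
    """
    searchable, skipped = [], []
    for name in databases:
        if name in REQUIRES_KEY and not api_keys.get(name):
            skipped.append(name)
        else:
            searchable.append(name)
    return searchable, skipped


searchable, skipped = split_searchable(["PubMed", "IEEE", "Springer"], {"IEEE": "my-key"})
print("will search:", searchable)
print("would be skipped:", skipped)
```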
Is it safe to put API keys in the configuration file?
For development: Yes, but never commit configuration files with API keys to version control.
Best practices:
- Add *_config.json to .gitignore
- Use environment variables for production
- Keep API keys secure and private
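One way to keep keys out of the configuration file is to read them from environment variables at run time and merge them into the loaded configuration. The sketch below is only an illustration: the "api_keys" configuration field, the function name inject_api_keys, and the variable name IEEE_API_KEY are assumptions, so check the Configuration Guide for the field names the tool actually expects.

```python
import os


def inject_api_keys(config, env_vars, environ=os.environ):
    """Return a copy of config with API keys taken from the environment.

    env_vars maps a database name to the environment variable holding its
    key. The "api_keys" field name is an assumption for this sketch.
    """
    merged = dict(config)
    keys = dict(merged.get("api_keys", {}))
    for database, var_name in env_vars.items():
        value = environ.get(var_name)
        if value:  # leave the entry untouched when the variable is unset
            keys[database] = value
    merged["api_keys"] = keys
    return merged


config = inject_api_keys({"databases": ["PubMed", "IEEE"]}, {"IEEE": "IEEE_API_KEY"})
print(sorted(config["api_keys"]))  # prints key names only, never the secrets
```

With this approach the JSON file committed to version control contains no secrets, and each machine supplies its own keys.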
My API key isn’t working. What should I check?
- Verify the key is correct (copy-paste errors are common)
- Check the key hasn’t expired
- Ensure you’re using the correct database name in the configuration
- Verify your API key has proper permissions
Usage
How many results can I get per database?
Use the --page_size parameter to control this:
literature-search --config sample_input.json --page_size 100
Recommendations:
- Start with 50-100 for testing
- Use 100-500 for production searches
- Be mindful of API rate limits
What does the --logic parameter do?
The --logic parameter controls how keywords are combined:
- OR (default): Returns publications matching ANY keyword
- AND: Returns publications matching ALL keywords
literature-search --config sample_input.json --logic AND
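The difference between the two modes corresponds to Python's any and all. The sketch below shows the semantics on a single piece of text; it assumes case-insensitive substring matching, and the function name matches_keywords is hypothetical. The tool itself may instead pass the operator to each database's query syntax.

```python
def matches_keywords(text, keywords, logic="OR"):
    """Return True if text satisfies the keywords under OR or AND logic."""
    haystack = text.lower()
    hits = [keyword.lower() in haystack for keyword in keywords]
    if logic.upper() == "AND":
        return all(hits)  # every keyword must appear
    return any(hits)      # one keyword is enough


title = "Deep learning for text classification"
keywords = ["deep learning", "radiology"]
print(matches_keywords(title, keywords, "OR"))
print(matches_keywords(title, keywords, "AND"))
```

AND typically returns fewer results, so it is worth trying when OR returns too many loosely related publications.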
Can I search a specific date range?
Yes, in your configuration file:
"date_range": "2015-2025"
This filters publications to only those within the specified years.
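For readers post-processing results themselves, a "START-END" range string can be interpreted roughly as follows. This sketch assumes both endpoints are inclusive, which the FAQ does not state explicitly; the function names are hypothetical.

```python
def parse_date_range(date_range):
    """Parse a 'START-END' string such as '2015-2025' into two integers."""
    start_text, end_text = date_range.split("-")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"start year {start} is after end year {end}")
    return start, end


def in_range(year, date_range):
    """Check a publication year against the range (endpoints assumed inclusive)."""
    start, end = parse_date_range(date_range)
    return start <= year <= end


print(in_range(2015, "2015-2025"))
print(in_range(2014, "2015-2025"))
```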
Why are my searches returning no results?
Common reasons:
- Keywords too specific: Try broader terms
- Date range too narrow: Expand the date range
- Criteria too restrictive: Review inclusion/exclusion criteria
- Wrong database: Some databases specialize in specific domains
- API key issues: Check if databases are being skipped
Output and Results
What files are generated?
After a search, you’ll find:
- all_publications_found.csv - All publications with status
- selected_publications.csv - Only included publications
- output_results.csv - Backward compatibility file
- results.json - Complete results in JSON
- prisma_flow_diagram_filled.drawio - PRISMA flow diagram
What’s the difference between the CSV files?
- all_publications_found.csv: Complete list with inclusion/exclusion reasons
- selected_publications.csv: Only publications that passed all criteria
- output_results.csv: Legacy format, kept for backward compatibility
Use selected_publications.csv for your literature review.
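The CSV files can be loaded with Python's standard csv module. Because the column names are not listed in this FAQ, the sketch below does not assume any and simply reports whatever header the file contains; the function names and the output/ path in the usage note are illustrative.

```python
import csv


def load_publications(path):
    """Read a results CSV into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def summarize(rows):
    """Return the row count and the column names found in the file."""
    columns = list(rows[0].keys()) if rows else []
    return {"count": len(rows), "columns": columns}
```

For example, summarize(load_publications("output/selected_publications.csv")) shows how many publications were selected and which columns are available for further analysis.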
How do I open the PRISMA flow diagram?
The .drawio file can be opened with:
- draw.io (web-based, free)
- diagrams.net (same as draw.io)
- Draw.io desktop application
Can I customize the PRISMA diagram template?
Yes! Place your own prisma_flow_diagram.drawio file in the output directory before running the search. The tool will use your template and fill it with data.
Why don’t I see abstracts for all publications?
Not all databases provide abstracts:
- Include abstracts: PubMed, SemanticScholar, Springer
- Limited abstracts: CrossRef, IEEE, Scopus
- No abstracts: arXiv (uses summary instead), DBLP
Troubleshooting
“ERROR: Could not read configuration file”
- Check the file path is correct
- Verify JSON syntax (use a JSON validator)
- Ensure required fields are present
- Check file permissions
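The checks above can be run before starting a search with a small script like the one below. It is a sketch, not part of the tool: the function name check_config is hypothetical, and the list of required fields is an assumption, so adjust it to match the Configuration Guide.

```python
import json

# Assumed required fields; consult the Configuration Guide for the real list.
REQUIRED_FIELDS = ("research_topic", "databases")


def check_config(path, required=REQUIRED_FIELDS):
    """Return a list of problems found in the configuration file."""
    try:
        with open(path, encoding="utf-8") as f:
            config = json.load(f)
    except FileNotFoundError:
        return [f"file not found: {path}"]
    except PermissionError:
        return [f"no read permission: {path}"]
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]
    if not isinstance(config, dict):
        return ["top-level JSON value must be an object"]
    return [f"missing field: {name}" for name in required if name not in config]
```

An empty list means none of the four checks failed; otherwise each entry names one problem to fix.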
“Rate limit exceeded” error
Some databases have rate limits:
- Wait a few minutes before retrying
- Reduce the --page_size parameter
- Spread searches over time
- Check API documentation for limits
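If you call database APIs from your own scripts, waiting and retrying can be automated with exponential backoff. The sketch below is generic and does not describe the tool's internal retry behavior; RateLimitError and call_with_backoff are hypothetical names, and the request argument stands for any function that raises RateLimitError when the API refuses the call.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error raised when an API reports a rate limit."""


def call_with_backoff(request, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call request(), doubling the wait after each rate-limit failure."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller handle it
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

The sleep parameter exists so the waiting function can be swapped out, for instance in tests.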
The tool is hanging or running very slowly
Possible causes:
- Large page size: Reduce --page_size
- Network issues: Check internet connection
- API timeouts: The tool retries automatically
- Multiple databases: Searching many databases takes time
I’m getting “Database not found” errors
Check the spelling of database names in your configuration:
- Correct: "PubMed", "CrossRef", "arXiv"
- Incorrect: "pubmed", "Crossref", "arxiv"
Database names are case-sensitive.
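A configuration can be checked for miscapitalized names before it is passed to the tool. The sketch below uses the database names as spelled in this FAQ; the function name normalize_database_names is hypothetical, and the exact spellings the tool accepts should be confirmed in the Databases Guide.

```python
# Names as spelled in this FAQ; confirm against the Databases Guide.
SUPPORTED = (
    "PubMed", "CrossRef", "arXiv", "CORE", "SemanticScholar",
    "DBLP", "IEEE", "Springer", "Scopus",
)


def normalize_database_names(names):
    """Map names to their canonical spelling, ignoring case.

    Returns (normalized, unknown), where unknown holds names that match
    no supported database even when case is ignored.
    """
    lookup = {name.lower(): name for name in SUPPORTED}
    normalized, unknown = [], []
    for name in names:
        canonical = lookup.get(name.strip().lower())
        if canonical is not None:
            normalized.append(canonical)
        else:
            unknown.append(name)
    return normalized, unknown


print(normalize_database_names(["pubmed", "Crossref", "arxiv"]))
```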
How do I report bugs?
- Check if it’s already reported in GitHub Issues
- Create a new issue with:
- Clear description of the problem
- Steps to reproduce
- Error messages
- Your configuration (sanitized - remove API keys)
- Environment details (OS, Python version)
PRISMA Methodology
What is PRISMA?
PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) is an evidence-based reporting guideline for systematic reviews and meta-analyses. It promotes transparency and completeness in how the review process is documented and reported.
Learn more: PRISMA Statement
How does this tool implement PRISMA?
The tool implements PRISMA by:
- Identification: Searching multiple databases systematically
- Screening: Applying inclusion/exclusion criteria automatically
- Eligibility: Filtering based on specified parameters
- Inclusion: Generating final list of included publications
- Documentation: Creating PRISMA flow diagrams and logs
Do I still need to do manual review?
Yes! The tool automates searching and initial filtering, but you should:
- Review abstracts of selected publications
- Access full texts for final inclusion
- Assess quality and relevance
- Document any additional manual exclusions
See prisma_guidelines.md for manual review guidance.
Can I use this for my thesis/dissertation?
Yes! However:
- Document your search methodology clearly
- Include search dates and database versions
- Report the tool version used
- Keep detailed records of your process
- Follow your institution’s requirements
How should I cite this tool in my research?
Literature Review Tool (Version X.X.X). Shanaka Abeysiriwardhana.
Available at: https://github.com/shanakaprageeth/literature_search
Also cite the specific database APIs you used.
Advanced Usage
Can I use this programmatically in my Python code?
Yes! See the Usage Guide for examples.
Can I add support for new databases?
Yes! See the Contributing Guide for instructions.
Can I integrate this with my existing workflow?
Yes! The tool provides:
- Command-line interface for scripts
- Python API for integration
- JSON output for downstream processing
- CSV output for spreadsheet analysis
How can I process results further?
The results.json file contains complete data for further processing:
import json

with open('output/results.json', 'r') as f:
    data = json.load(f)

# Process data as needed
for pub in data['publications']:
    # Your analysis here
    pass
Performance
How long does a typical search take?
Search time depends on:
- Number of databases: 1-3 minutes per database
- Page size: Larger sizes take longer
- Network speed: Affects API calls
- API response times: Varies by database
Typical: 5-15 minutes for 5 databases with page_size=100
Can I speed up the search?
Optimization tips:
- Reduce number of databases
- Lower page_size for testing
- Use more specific keywords
- Narrow date range
- Run during off-peak hours (better API response)
Does the tool cache results?
No, the tool doesn’t cache API results. Each search is fresh. If you need to re-process results without searching again, use the results.json file.
Support and Community
Where can I get help?
- Documentation: Read these docs thoroughly
- GitHub Issues: Report bugs and request features
- GitHub Discussions: Ask questions and share experiences
- Source Code: Review code for implementation details
How can I contribute?
See the Contributing Guide for detailed instructions.
Is there a mailing list or forum?
Currently, we use GitHub for all community interaction:
- Issues for bugs and features
- Discussions for questions and ideas
- Pull requests for contributions
Still have questions? Open an issue on GitHub or check the documentation.