Sharing analysis code & software

Sharing analysis code and software is a key open research practice that promotes transparency, reproducibility, and collaboration. By making your scripts, programs, or workflows publicly available, you enable others to reproduce your analyses, verify your findings, and adapt your methods for new research.

Upload your code to trusted repositories such as GitHub, GitLab, the Keele Data Repository, or the Open Science Framework (OSF). Include clear documentation that explains the code’s purpose, structure, inputs, and outputs, as well as any dependencies or software version details. Well-organised, readable code with meaningful comments and consistent formatting greatly improves usability.

To ensure long-term accessibility and citation, link your repository to an archival service such as Zenodo to generate a Digital Object Identifier (DOI). Apply an open-source licence, for example one approved by the Open Source Initiative, to specify reuse permissions.

This practice not only strengthens trust in your research but also supports cumulative scientific progress by allowing others to build on your methods. As open science practices become more widespread, many journals, funding bodies, and the Research Excellence Framework (REF) now encourage or require the sharing of analysis code, making it an essential component of credible and responsible quantitative research.

No Action (Quant.)

Analysis code and software are not shared. Analyses are conducted using methods or scripts that are undocumented and inaccessible to others, so they cannot be independently verified or reproduced.

Moving from No Action to Emerging in Sharing Analysis Code/Software (Quant.)

  1. To progress from No Action to Emerging, start by improving the clarity, structure, and traceability of your analysis code. These initial steps will help make your work more understandable to others and easier for you to reuse in the future.
  2. Annotate Your Code. Add brief comments throughout your scripts explaining what each section does. Use consistent naming conventions for variables, functions, and files. Organise complex scripts into clear sections such as data cleaning, analysis, and visualisation.
  3. Record Your Software Environment. Document the tools, packages, and software versions used in your analysis. In R, you can use sessionInfo(); in Python, try pip freeze. Note any custom templates, macros, or plugins that are required to reproduce your work.
  4. Create Basic Documentation. Write a simple README file that explains:
    • The purpose of the analysis
    • How your materials are structured
    • What each script or file does
    • How to run the code
    • What inputs are needed and what outputs are produced
  5. Choose a Platform for Sharing. Create an account on a trusted platform such as GitHub, GitLab, the Open Science Framework (OSF), or your university repository. Begin exploring how to upload your files, even privately at first, to familiarise yourself with the sharing process.
  6. Check Ethical and Intellectual Property Considerations. Review any funder, institutional, or data protection policies that might affect what you can share. If you are unsure, seek advice from your research office or ethics board before making materials public.
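
Step 3 above (recording your software environment) can also be done from inside a Python session. The following is a minimal sketch, not a replacement for pip freeze: it uses only the standard library, and the function names and the output file name environment_record.txt are illustrative assumptions rather than an established convention.

```python
import platform
from importlib import metadata
from pathlib import Path


def describe_environment() -> list[str]:
    """Return plain-text lines describing the current Python environment."""
    lines = [
        f"python_version: {platform.python_version()}",
        f"platform: {platform.platform()}",
    ]
    # Collect name==version entries for installed packages,
    # similar in spirit to the output of `pip freeze`.
    pins = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip distributions with missing or broken metadata
            pins.add(f"{name}=={dist.version}")
    lines.extend(sorted(pins, key=str.lower))
    return lines


def save_environment(path: str = "environment_record.txt") -> None:
    """Write the environment description to a text file (hypothetical file name)."""
    Path(path).write_text("\n".join(describe_environment()) + "\n", encoding="utf-8")
```

Calling save_environment() at the end of an analysis script would store the record next to your outputs, where it can be mentioned in the README.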

Emerging (Quant.)

Some analysis code is shared upon request, but sharing remains informal and inconsistent. Documentation is limited, and the code cannot be easily understood or reproduced by others.

Moving from Emerging to Evolving in Sharing Analysis Code/Software (Quant.)

  1. To progress from Emerging to Evolving, focus on making your analysis code organised, documented, and accessible so others can understand and reproduce your work more easily.
  2. Add a README File. Create a clear README file to accompany your code. Describe:
    • The purpose of the analysis
    • The structure of your files and scripts
    • Required software and tools
    • How to run the code
    • The expected outputs (for example, tables, graphs, or model summaries)
     A good README helps others navigate your materials and ensures your future self can reproduce your analyses later.
  3. Record Your Software Environment. Document all dependencies, software versions, and packages. Tools such as sessionInfo() (R), pip freeze (Python), or conda env export (Conda) can generate this automatically. Include these details in your README or as a separate file (for example, requirements.txt or environment.yml).
  4. Upload to a Trusted Platform. Choose a reliable platform to store and share your materials:
    • GitHub or GitLab for version-controlled code
    • Open Science Framework (OSF) for complete project structures
    • Zenodo for long-term archiving and DOI generation
     Upload your cleaned and annotated scripts to the platform, ensuring that others can access and cite them easily.
  5. Address Ethical and Intellectual Property Considerations. Before sharing, review whether your code or data contain sensitive information or proprietary elements. Remove identifying details, credentials, or confidential data paths. If your analysis uses licensed software or third-party scripts, check the reuse conditions and cite them appropriately.
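
As an aid to step 5 above, a script can be screened for obvious credentials and user-specific file paths before it is uploaded. The sketch below is a rough first pass under stated assumptions: the two patterns and the function name find_sensitive_lines are illustrative, they will miss many cases, and they do not replace a manual review.

```python
import re

# Illustrative patterns only; extend or replace them for your own project.
PATTERNS = {
    "possible credential": re.compile(
        r"(?i)\b(password|passwd|secret|api_key|token)\b\s*=\s*['\"]"
    ),
    "absolute user path": re.compile(r"(/Users/|/home/|[A-Za-z]:\\Users\\)"),
}


def find_sensitive_lines(text: str) -> list[tuple[int, str]]:
    """Return (line number, reason) pairs for lines that match a pattern."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for reason, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, reason))
    return findings


# Example input: a made-up script with one credential and one personal path.
sample_script = (
    "x = 1\n"
    'password = "hunter2"\n'
    'df = read("/home/alice/data.csv")\n'
)
for number, reason in find_sensitive_lines(sample_script):
    print(f"line {number}: {reason}")
```

Flagged lines can then be replaced with relative paths or with values read from a configuration file that is kept out of the shared repository.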

Evolving (Quant.)

Analysis code and software are shared in trusted public repositories with basic documentation, open licensing, and initial efforts to make analyses understandable and reproducible by others.

Moving from Evolving to Sustained in Sharing Analysis Code/Software (Quant.)

  1. To move from Evolving to Sustained, focus on enhancing the transparency, usability, and long-term accessibility of your analysis materials.
  2. Refine Documentation and Structure. Ensure your scripts and workflows (for example, in R or Python) are well organised, clearly commented, and accompanied by a comprehensive README file. The README should describe the purpose of each script, the analytical steps, expected outputs, and any dependencies or software versions required.
  3. Implement Version Control and Reproducibility. Use version control systems such as Git to track updates and maintain transparency across project stages. Record your software environment using tools such as sessionInfo() (R), pip freeze (Python), or conda env export to ensure others can reproduce your work exactly.
  4. Share Through Trusted Repositories. Deposit your code and documentation in repositories suited to long-term access and citation, such as GitHub, the Open Science Framework (OSF), or Zenodo. Where possible, link repositories (for example, GitHub to Zenodo) to generate a Digital Object Identifier (DOI) for citation and version tracking.
  5. Clarify Licensing and Ethical Practices. Include an open licence to define reuse permissions clearly, such as MIT for code or CC BY 4.0 for documentation and other non-code materials. Review your materials to ensure no sensitive or proprietary content is included, and anonymise any data if applicable.
  6. Promote Reuse and Integration. Reference your shared code and DOIs in publications, presentations, and teaching materials. Encourage reuse by providing example data or tutorials that demonstrate how to run your analyses. Track downloads, forks, and citations to understand your impact and improve your materials over time.
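
A recorded environment (step 3 above) is most useful when it can be checked against the environment in which someone later reruns the code. The following sketch assumes a pip freeze style file of name==version lines. The function names are hypothetical, and other requirement formats (version ranges, URLs, extras) are deliberately ignored.

```python
from importlib import metadata


def parse_pins(text: str) -> dict[str, str]:
    """Parse 'name==version' lines, ignoring comments and other formats."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name: str):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins: dict[str, str], lookup=installed_version) -> dict:
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

For example, find_mismatches(parse_pins(open("requirements.txt").read())) returns an empty dictionary when every pinned package is installed at the recorded version. The lookup parameter exists only so that the comparison can be tried without installing anything.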

Sustained (Quant.)

Analysis code and software are consistently shared through trusted platforms such as GitHub or the Open Science Framework (OSF). Materials include comprehensive documentation, full version control, open-source licensing, and sufficient metadata to ensure transparency, reproducibility, and long-term reusability.

Guidance for Sustained Level in Sharing Analysis Code/Software (Quant.)

Congratulations. You are operating beyond good practice and contributing as a field leader in open and reproducible quantitative research. At this stage, analysis code and software are consistently shared through trusted, openly accessible platforms such as GitHub, GitLab, the Open Science Framework (OSF), or Zenodo. All materials are clearly structured, version-controlled, and accompanied by comprehensive documentation. To maintain and further improve at this level, consider the following:

  • README: A concise and clear overview of the project, including its purpose, inputs, outputs, and step-by-step instructions for reproducing analyses.
  • Annotations: Detailed comments throughout the code explaining key stages such as data cleaning, transformation, statistical modelling, and visualisation.
  • Software Environment: Files such as requirements.txt, environment.yml, or sessionInfo() outputs that fully specify software dependencies to ensure reproducibility.
  • Licence: An open licence, for example MIT for code or CC BY 4.0 for documentation and other non-code materials, that clearly defines how others may reuse or adapt the work.
  • Version History: A changelog or other version-tracking system that allows users to access earlier versions and reproduce prior analyses when needed.
  • Ethics and Data: Sensitive or proprietary data removed or anonymised, with clear guidance on what cannot be shared and how data access can be requested where appropriate.
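
Part of the checklist above can be screened automatically before each release. The sketch below only checks whether files with commonly used names are present in a project folder. The accepted file names are assumptions, not a standard, and the check says nothing about the quality of the documentation, annotations, or ethics review.

```python
# Accepted file names per checklist item (lowercase); illustrative only.
CHECKLIST = {
    "README": ("readme", "readme.md", "readme.txt"),
    "Licence": ("license", "license.md", "license.txt", "licence", "licence.md"),
    "Software environment": ("requirements.txt", "environment.yml", "sessioninfo.txt"),
    "Version history": ("changelog", "changelog.md", "changelog.txt"),
}


def missing_items(filenames: list[str]) -> list[str]:
    """Return the checklist items for which no matching file name was found."""
    present = {name.lower() for name in filenames}
    return [
        item
        for item, candidates in CHECKLIST.items()
        if not present.intersection(candidates)
    ]


# Example: a made-up project folder that lacks a licence and a changelog.
example_files = ["README.md", "analysis.py", "requirements.txt"]
print(missing_items(example_files))
```

In practice the list of file names could come from os.listdir() on the repository root, with any missing items added before the next version is archived.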