Tutorials


WebSweep: Collecting Website Text for Research
March 24, 2026

WebSweep helps researchers capture what was publicly visible on a given date, preserve the raw HTML as a reproducible archive, and turn those pages into analysis-ready text.

Use WebSweep when you:

  • have a list of public websites or domains
  • want a repeatable workflow for many domains
  • mainly need HTML text and metadata from public pages

In this tutorial, we use FIRMBACKBONE as the example. FIRMBACKBONE is the Dutch research infrastructure that provides secure, FAIR access to comprehensive data on all registered organizations in the Netherlands, including web-based data. We want to collect information from corporate websites, for example to track how broadly and how deeply they cover the energy transition. The same workflow can be used for universities, NGOs, government organizations, local news sites, project websites, or any other public set of domains.
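
To make the "raw HTML in, analysis-ready text out" idea concrete, here is a minimal sketch that uses only the Python standard library. It is not WebSweep's API: the names `VisibleTextParser` and `html_to_record` are hypothetical, and the record layout (URL, fetch date, checksum of the raw HTML, extracted text) is an assumption about what a reproducible archive entry might contain.

```python
import hashlib
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping script, style and noscript content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            # Collapse runs of whitespace so the text is easier to analyse.
            text = " ".join(data.split())
            if text:
                self.chunks.append(text)


def html_to_record(url, raw_html, fetched_on):
    """Turn one already-downloaded page into a small metadata-plus-text record.

    Hypothetical helper: the field names are illustrative only.
    """
    parser = VisibleTextParser()
    parser.feed(raw_html)
    parser.close()
    return {
        "url": url,
        "fetched_on": fetched_on,
        # A checksum of the raw HTML lets you verify the archived copy later.
        "sha256": hashlib.sha256(raw_html.encode("utf-8")).hexdigest(),
        "text": " ".join(parser.chunks),
    }
```

Calling `html_to_record` on each saved page yields one record per URL. Keeping the raw HTML alongside the record means the text extraction can be rerun later with different rules.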

How to share your research code
September 5, 2022

What are the best ways to create an understandable, openly accessible, findable, citable, and stable archive of your code?
