Pagure Exporter v0.1.4 Released

In the first half of 2025, communities like those of Fedora Project, CentOS Project and OpenSUSE Project were migrating their projects away from Pagure. The v0.1.4 release of Pagure Exporter helps significantly with the heavy lifting involved, and this article covers my experiences as its architect.

The first and second quarters of 2025 were the time when a bunch of free and open source software communities seemed to be actively moving away from Pagure to either GitLab (in the case of CentOS Project and OpenSUSE Project) or Forgejo (in the case of Fedora Project). Having written Pagure Exporter a couple of years back and being deeply involved in the Fedora To Forgejo initiative, I found myself in the middle of all the Git forge migration craziness. With a bunch of feature requests reaching the doors of the project, I wanted to make the best use of my time to deliver the first release of 2025 for Pagure Exporter using the effective workflows and community personnel at my disposal. In this article, I cover my experiences with the efforts that made this release possible.

Homepage of Pagure Exporter - https://github.com/fedora-infra/pagure-exporter

Impressions

Contributing to a hustling and bustling free and open source software community like those of Fedora Project and CentOS Project means that there are always some tasks that need to be completed soon. Thankfully, there are also a bunch of passionate contributors willing to roll up their sleeves and hit the ground running as long as they are aware of those tasks. While I was sometimes affected by the unreliability of certain software libraries and the intermittent AI scraper attacks on Pagure, I was also joined by the likes of Greg Sutcliffe, Fabian Arrotin, Yashwanth Rathakrishnan, Shounak Dey and Peter Olamide in the efforts. Furthermore, I made it a point to use assistive artificial intelligence technologies, at my discretion, for purposes like explaining extended logs and generating code inspirations to kick things off from.

CentOS Git Server migration to GitLab by Davide Cavalca - https://pagure.io/centos-infra/issue/1654

Apes (Are) Strong Together

The request to extend Pagure Exporter to support various other hostnames (like those of Fedora Dist Git and CentOS Git Server) was first scoped around January 2025. With me occupied with the Fedora To Forgejo migration efforts, it was not until March 2025 that the work on it was started by an Outreachy applicant, Rajesh Patel. As the request had grown in priority by April 2025, I decided to briefly context switch from my existing work to implement the support for different Pagure hostnames. While this was reviewed positively by Michal Konecny and Aurelien Bompard, the readability of the introduced code itself was in question, so that had to be resolved separately and by someone else to ensure that I did not end up introducing code changes that only I could understand.
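To make the idea of supporting different Pagure hostnames concrete, here is a minimal sketch of how a hostname could be turned into the URLs a migration needs. It is not the code that landed in Pagure Exporter: the function names and the `example` repository paths are hypothetical, and the only things taken as given are the three hostnames mentioned in this article and the `/api/0/<repo>/issues` layout of the Pagure API.

```python
from urllib.parse import urlsplit

# Hostnames named in this article; any other Pagure instance is passed as-is.
KNOWN_HOSTS = ("pagure.io", "src.fedoraproject.org", "git.centos.org")


def normalize_host(host: str) -> str:
    """Accept 'pagure.io' or 'https://pagure.io/' and return the bare hostname."""
    candidate = host if "://" in host else f"https://{host}"
    netloc = urlsplit(candidate).netloc
    if not netloc:
        raise ValueError(f"Not a usable hostname: {host!r}")
    return netloc.lower()


def issues_endpoint(host: str, repo: str) -> str:
    """Build the issue listing URL of the Pagure API for a repository."""
    return f"https://{normalize_host(host)}/api/0/{repo.strip('/')}/issues"


def clone_url(host: str, repo: str) -> str:
    """Build the HTTPS clone URL of a repository on the given instance."""
    return f"https://{normalize_host(host)}/{repo.strip('/')}.git"


if __name__ == "__main__":
    for host in KNOWN_HOSTS:
        # 'example' is a placeholder repository path, not a real project.
        print(issues_endpoint(host, "example"), clone_url(host, "example"))
```

Keeping the hostname as a parameter rather than a constant is what lets the same code path serve Pagure, Fedora Dist Git, CentOS Git Server or any custom deployment.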

Wrapper to check / create projects on GitLab using the REST API by Greg Sutcliffe - https://pagure.io/centos-infra/issue/1658

Leading up to the v0.1.4 release of Pagure Exporter, I was helped by Greg, who himself explored the GitLab API to build a simple Python script that automatically created projects on GitLab under a certain namespace. Pagure Exporter was expected to work in tandem with the said script to migrate repository contents and issue tickets from Pagure as soon as the projects were created on GitLab. We also discussed the possibility of offloading the migration to the GitLab infrastructure to minimize potential network hiccups during the transfer process. Davide Cavalca also joined in to help tailor the approach of the migration proceedings, and Fabian imported the CentOS Board and CentOS Infra namespaces as dry runs while making observations as to how the tool could be used at scale in automation.
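For readers unfamiliar with the GitLab side, the check-then-create flow described above can be sketched with the standard library alone. This is not Greg's script, only an illustration under stated assumptions: the function names, the token, the namespace ID and the project names are placeholders, while the `/api/v4/projects` endpoints and the `PRIVATE-TOKEN` header come from the GitLab REST API. The requests are only built here, not sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_check_request(base_url: str, token: str, full_path: str) -> Request:
    """Prepare a GET that looks a project up by its URL-encoded full path."""
    encoded = quote(full_path, safe="")  # 'group/project' -> 'group%2Fproject'
    return Request(
        url=f"{base_url.rstrip('/')}/api/v4/projects/{encoded}",
        headers={"PRIVATE-TOKEN": token},
        method="GET",
    )


def build_create_request(base_url: str, token: str, namespace_id: int, name: str) -> Request:
    """Prepare a POST that creates a project under the given namespace."""
    payload = {"name": name, "path": name, "namespace_id": namespace_id}
    return Request(
        url=f"{base_url.rstrip('/')}/api/v4/projects",
        data=json.dumps(payload).encode("utf-8"),
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; sending would be urllib.request.urlopen(request).
    request = build_create_request("https://gitlab.com", "<token>", 42, "example-project")
    print(request.get_method(), request.full_url)
```

A wrapper would send the check request first and only fall through to the create request when the lookup reports that the project does not exist, after which the exporter can push repository contents and issue tickets to it.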

Create a repo and FAS group for FRCL by Fabian Arrotin - https://pagure.io/centos-infra/issue/1709

Gifted With Zealous Mentees

While Rajesh's work could not be merged, I did appreciate the effort that he put into understanding the project, and I hope that I was able to provide some learnings in return. Just like him, we had another enthusiastic Outreachy applicant, Peter, who helped in fixing the deprecation status of the datetime library usage. Yashwanth helped out by going around the codebase to update the copyright years across the code headers. The one contributor who was immensely helpful was Shounak, who assisted in moving from using absolute imports to relative ones and in renaming identifiers for improved readability, thus addressing the previously stated concerns. Finding external contributors was difficult due to the challenges we faced with the VCR.py library failing inexplicably, but these amazing mentees used it as a learning opportunity.
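As an illustration of what a datetime deprecation fix typically looks like, the sketch below assumes that the deprecated calls were the naive UTC helpers `datetime.utcnow()` and `datetime.utcfromtimestamp()`, which Python 3.12 marks as deprecated, and that timestamps arrive as epoch seconds. The actual change is in the linked pull request; the helper names here are hypothetical.

```python
from datetime import datetime, timezone


def from_epoch(timestamp) -> datetime:
    """Timezone-aware replacement for datetime.utcfromtimestamp(timestamp)."""
    # int() lets the helper accept epoch seconds given as text or as numbers.
    return datetime.fromtimestamp(int(timestamp), tz=timezone.utc)


def utc_now() -> datetime:
    """Timezone-aware replacement for datetime.utcnow()."""
    return datetime.now(timezone.utc)


if __name__ == "__main__":
    print(from_epoch("0").isoformat())  # 1970-01-01T00:00:00+00:00
    print(utc_now().tzinfo)
```

The aware objects carry their UTC offset with them, so formatting or comparing them no longer depends on remembering that a naive value was meant to be UTC.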

Fix the deprecation status of the datetime library usage by Peter Olamide - https://github.com/fedora-infra/pagure-exporter/pull/157

Patience is probably one of the most defining characteristics of those working on free and open source projects. While I try to keep my turnaround time under a week to address any open issue tickets or pull requests, as evidenced by those under the v0.1.4 release, sometimes it can take months to get back to a certain piece of work, as evidenced by the codebase changes for improving readability. As I have been taking on more work after my promotion to Senior Software Engineer, I have also begun to include open source artificial intelligence tooling like Ramalama, Ollama and Cursor in my workflow for reviewing external codebase changes and finding alternative performance optimizations - all to ensure that the quality of my work remains high while I context switch from one task to another without losing momentum.

Rename identifiers for improved readability by Shounak Dey - https://github.com/fedora-infra/pagure-exporter/pull/191

The AI Scraper Attack

While I wrote in the previous section about how including open source artificial intelligence technologies in my workflow was helpful in making me productive, this section is more about how external AI scrapers hindered the progress of the v0.1.4 release of Pagure Exporter. Pagure has been receiving unreasonable amounts of traffic from various AI scrapers for a while now, but things seemed to worsen in the second half of June 2025, when the bombardment of millions of heavy requests led to the service becoming inaccessible to legitimate users. As the project relied on making actual Git requests over HTTPS (while the REST requests over HTTPS were masqueraded) for testing purposes, we could not reliably verify the correctness of the codebase changes, thus negatively affecting the initiative of moving CentOS repos to GitLab.

Trigger CI to run on push or pull_request towards main by Shounak Dey - https://github.com/fedora-infra/pagure-exporter/pull/210

Even though I run a bunch of selfhosted applications and services on my homelab infrastructure, I am by no means a system administrator, so I had to rely on Kevin Fenzi to block out the offending IP addresses. I have had my fair share of problems with AI scrapers on my testing deployment of Forgejo, to the point that I had to keep it behind Cloudflare verification, so I understood just how difficult it must have been for him to keep the unreasonable requestors at bay. Learning from the deployment of Codeberg, I have been looking into Anubis to understand just how we can leverage it to protect the upstream resources from the AI scrapers. Given that the Fedora Infrastructure was undergoing a datacenter move as of the first week of July 2025, the experimentation with (or implementation of) this solution had to wait for later.

Fedora Infrastructure status page as of 02nd July 2025 - https://status.fedoraproject.org/

Unreliable Libraries For Testing

Imagine something pissing me off so much that I had to write about my experience with it in its own dedicated section! I want to preface the section by saying that for whatever trouble VCR.py has given me since the beginning of 2025, it has been immensely helpful in ensuring that I do not have to make a bunch of requests to an actual server. For some reason, the tests involving VCR.py used to work just fine during development but failed inexplicably on GitHub Actions - and the error messages were of no help, especially when they were related to failing matchers, existing cassettes, non-existent cassettes, count mismatches etc. There happened to be a bunch of pull requests lined up to address the mentioned concerns, but they were not actively looked into - so I decided that it was about time for me to move away.

Move away from VCR.py to responses by Akashdeep Dhar - https://github.com/fedora-infra/pagure-exporter/pull/200

And move away I did - to Responses. It was more than a methodology switch though, as it included a shift in philosophy: unlike VCR.py, which records real HTTP requests and replays them, Responses mocks the HTTP call entirely. With the increasing roster of over 90 testcases that ensured a stellar 100% codebase coverage, converting the cassettes to Responses by hand would have been a chore. In came my trustworthy AMD Radeon RX6800XT and Ramalama to the rescue, and I was able to parse through the VCR.py cassettes to obtain Response Definition objects during the testing runtime. The solution was great, even if I say so myself, as I saved approximately ten to fifteen hours of trudging along (and of course, boredom) to painstakingly port the associated recordings to the respective HTTP testcases.
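To give a feel for what parsing cassettes into response definitions can involve, here is a small sketch that is not the project's actual conversion code. It assumes the cassette has already been loaded into a dictionary with the `interactions` layout that VCR.py writes to its YAML files, and the function name and the choice of skipped headers are illustrative only.

```python
# Headers that describe the recorded transfer rather than the mocked body,
# so this sketch leaves them out (a design choice, not a VCR.py rule).
SKIPPED_HEADERS = {"content-length", "content-encoding", "transfer-encoding"}


def cassette_to_definitions(cassette: dict) -> list:
    """Turn parsed VCR.py interactions into keyword arguments for a mock."""
    definitions = []
    for interaction in cassette.get("interactions", []):
        request = interaction["request"]
        response = interaction["response"]
        definition = {
            "method": request["method"],
            "url": request["uri"],
            "status": response["status"]["code"],
            "body": response["body"]["string"],
            "headers": {},
        }
        for name, values in response.get("headers", {}).items():
            # Cassettes store each header as a list of values.
            value = ", ".join(values) if isinstance(values, list) else str(values)
            if name.lower() == "content-type":
                # Kept apart from the other headers so it can be passed separately.
                definition["content_type"] = value
            elif name.lower() not in SKIPPED_HEADERS:
                definition["headers"][name] = value
        definitions.append(definition)
    return definitions
```

With the Responses package installed, each resulting dictionary could then be registered inside a test decorated with `@responses.activate` by calling `responses.add(**definition)`, so that the test never touches the network at all.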

List of pull requests under kevin1024/vcrpy as of 02nd July 2025 - https://github.com/kevin1024/vcrpy/pulls

Changelog

Published on PyPI - Pagure Exporter v0.1.4
Published on Fedora Linux - Pagure Exporter v0.1.4
Published on GitHub - Pagure Exporter v0.1.4

GitHub release of Pagure Exporter v0.1.4 - https://github.com/fedora-infra/pagure-exporter/releases/tag/0.1.4

From maintainers

  • Fixed the deprecation status of the datetime library usage
  • Tailor fitted the filters to remove credentials before recordings are stored locally
  • Updated the Packit configuration to satiate Packit v1.0.0 release
  • Moved away from using absolute imports to using relative imports
  • Introduced support for CentOS Git Server (i.e. https://git.centos.org)
  • Introduced support for Fedora Dist Git (i.e. https://src.fedoraproject.org)
  • Introduced support for different custom Pagure hostnames
  • Updated copyright years across all the codebase headers
  • Renamed the identifiers for improved codebase readability
  • Moved away from VCR.py to Responses for test caching purposes
  • Made various automated dependency and security updates
  • Marked the first release of Pagure Exporter in 2025

From GitHub

  • Automated dependency updates by @renovate in #90
  • Automated dependency updates by @renovate in #91
  • Attempt to not mess up the repository secrets by @gridhead in #155
  • Fix the deprecation status of the datetime library usage by @olamidepeterojo in #157
  • Update dependency black to v25 by @renovate in #159
  • Update dependency ruff to ^0.0.285 || ^0.1.0 || ^0.2.0 || ^0.3.0 || ^0.4.0 || ^0.5.0 || ^0.6.0 || ^0.9.0 by @renovate in #146
  • Update dependency vcrpy to v7 by @renovate in #150
  • Automated dependency updates by @renovate in #149
  • Update Packit config after Packit v1.0.0 release by @gridhead in #160
  • Automated dependency updates by @renovate in #161
  • Automated dependency updates by @renovate in #162
  • Automated dependency updates by @renovate in #169
  • Automated dependency updates by @renovate in #170
  • Automated dependency updates by @renovate in #171
  • Update dependency ruff to ^0.0.285 || ^0.1.0 || ^0.2.0 || ^0.3.0 || ^0.4.0 || ^0.5.0 || ^0.6.0 || ^0.9.0 || ^0.10.0 by @renovate in #173
  • Update dependency ruff to ^0.0.285 || ^0.1.0 || ^0.2.0 || ^0.3.0 || ^0.4.0 || ^0.5.0 || ^0.6.0 || ^0.9.0 || ^0.10.0 || ^0.11.0 by @renovate in #174
  • Update dependency pytest-cov to v6 by @renovate in #147
  • Automated dependency updates by @renovate in #175
  • Automated dependency updates by @renovate in #176
  • Automated dependency updates by @renovate in #177
  • Automated dependency updates by @renovate in #178
  • Automated dependency updates by @renovate in #179
  • Move from using relative imports instead of absolute imports by @sdglitched in #185
  • chore: updated copyright years across all the codebase headers by @iamyaash in #183
  • chore(deps): automated dependency updates by @renovate in #189
  • Introduce support for different Pagure hostnames by @gridhead in #188
  • chore(deps): automated dependency updates by @renovate in #192
  • chore(deps): automated dependency updates by @renovate in #193
  • chore(deps): automated dependency updates by @renovate in #194
  • fix(deps): update dependency requests to v2.32.4 [security] by @renovate in #195
  • chore(deps): automated dependency updates by @renovate in #196
  • Rename identifiers for improved readability by @sdglitched in #191
  • Move away from VCR.py to Responses by @gridhead in #200
  • chore(deps): update dependency ruff to ^0.0.285 || ^0.1.0 || ^0.2.0 || ^0.3.0 || ^0.4.0 || ^0.5.0 || ^0.6.0 || ^0.9.0 || ^0.10.0 || ^0.11.0 || ^0.12.0 by @renovate in #197
  • chore(deps): automated dependency updates by @renovate in #198
  • Version bump from v0.1.3 to v0.1.4 by @gridhead in #202

New contributors