- Merge request !105 Issue #3295357: Migrate drupal.org issues to gitlab issues → (Open) created by fjgarlin
-
drumm →
committed f61d62bb on reg-prot authored by
fjgarlin →
Issue #3295357 by fjgarlin: Remove link to www.drupal.org issues
- 🇺🇸United States dww
Coming here from a #gitlab Slack meeting thread about issue IDs. When you run git blame and find a commit, you see things like “Issue #12345: whatever”. It’s going to very seriously harm DX if I have to check both drupal.org/node/12345 and GitLab issue 12345 to figure out which one is actually it.
According to Moshe, GitLab allows us to set the ID when creating new issues programmatically. I strongly believe we should use the d.o issue NID to specify the gitlab issue ID during the migration. Then, after all the issues are in GitLab, we know that we can always go to GitLab/$project/issues/ID and there won’t be any collisions.
- 🇪🇸Spain fjgarlin
I investigated the API possibilities and there is a bit of a conflict here.
It is possible to set the IID when creating an issue (https://docs.gitlab.com/ee/api/issues.html#new-issue), however this "requires administrator or project owner rights". This means that we'd need to run the creation as admin, which creates a new problem with the "author" property: the author would be the user who created the issue (the admin), and that property can NOT be set on edit (https://docs.gitlab.com/ee/api/issues.html#edit-issue). So we'd lose the context of who created the issue.
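To make the trade-off concrete, here is a minimal sketch of the request a migration script might build to create an issue with a fixed IID. The `iid` parameter and the `PRIVATE-TOKEN` header come from the GitLab issues API linked above; the function name, the host constant, and the sample values are assumptions for illustration, not the code in the MR. The request is only built, never sent.

```python
import json
import urllib.parse
import urllib.request

GITLAB_API = "https://git.drupalcode.org/api/v4"  # assumed host, for illustration only


def build_create_issue_request(project_path, nid, title, description, admin_token):
    """Build (but do not send) a POST that creates an issue whose IID equals the d.o NID."""
    # GitLab accepts a URL-encoded project path in place of the numeric project ID.
    project = urllib.parse.quote(project_path, safe="")
    payload = {
        "title": title,
        "description": description,
        # Setting iid requires administrator or project owner rights, so the
        # issue author becomes whoever owns admin_token, not the d.o author.
        "iid": nid,
    }
    return urllib.request.Request(
        url=f"{GITLAB_API}/projects/{project}/issues",
        data=json.dumps(payload).encode("utf-8"),
        headers={"PRIVATE-TOKEN": admin_token, "Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical sample values.
request = build_create_issue_request(
    "project/drupalorg",
    3295357,
    "Migrate drupal.org issues to gitlab issues",
    "Migrated from drupal.org.",
    "hypothetical-token",
)
print(request.get_method(), request.full_url)
```

The comment on `iid` marks where the conflict arises: the same token that is allowed to set the IID also determines the author.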
The way the scripts are designed is to create the new issue and then to create all the needed URL redirects. So somebody going to drupal.org/i/12345 for an old issue will be redirected to the new place. These redirects will be migrated to the new D10 site, so I think it's just a matter of agreeing to this convention, which I think is even easier than git.drupalcode.org/project/PROJECT/issues/IID
- 🇺🇸United States dww
I’m 💯 for the redirects. I was hoping to have both.
That’s really too bad that you lose who created the issues if you set the ids. What a sad limitation of GL. Maybe there’s a work-around?
Apparently I’m having trouble explaining my concern. I believe it’s going to be confusing and annoying to have to check two places (the d.o redirect and the GL link), and not really know if a given ID refers to one or the other. I guess we’ll have to start checking dates in commits, too, and memorize the date the migration ran live. There won’t be d.o redirects for new issues, so if I see “12345” in a post-GL commit, I can’t use the “easier URLs”, anyway.
- 🇫🇷France fgm Paris, France
Or we could change conventions and ask that issues created on GL be called something else than issue, e.g. "ticket". That way any "issue #xx" would be on d.o. and "ticket #yy" would be on GL.
- 🇸🇰Slovakia poker10
Redirects are great, but I personally feel that these are not enough and that without another connection, this will be very fragile. Imagine that we somehow lose one or more redirects (which I think can happen if these are editable on d.o. like other content redirects), then there will be no way for users to find that particular issue in GitLab.
I think there are many places where issue IDs are mentioned just as plain text (like #123456, without square brackets) and for less experienced contributors it could be hard to find the referenced issue.
I am also wondering if keeping the ID would help directly in the GitLab UI, because we still need to resolve/convert strings [#XXX] to the new GitLab issue IDs, so there would have to be a conversion in place in addition to the redirects (if I understand that correctly). In case we keep the IDs the same, wouldn't we match the GitLab method of referencing and would it be possible to omit the conversion?
So I agree with @dww and prefer to keep IDs, so we can hopefully find some workaround for that issue with authors (as I think that losing the information about the issue author is a no-go).
- 🇺🇸United States dww
fgm: indeed, I asked/proposed that in the Slack meeting. Sorry I didn’t mention it here. It’s still a bit of cognitive load, but it’s less than checking dates, for sure. But then we have to re-train everyone on commit message conventions…
- 🇺🇸United States moshe weitzman Boston, MA
The redirect controller could forward unknown issue ids to the gitlab url, without any mapping. That way there is one url to check.
- 🇸🇰Slovakia poker10
Looking at the code of the MR here, redirects seem to be created as standard redirects via the redirect module.
Another point - we also have links comment/XX, which are redirecting to the concrete comment in the issue. Is this redirect solved for referencing migrated comments as well?
- 🇪🇸Spain fjgarlin
https://www.drupal.org/comment/15302889 redirects to https://www.drupal.org/project/drupalorg/issues/3295357#comment-15302889 (Migrate drupal.org issues to gitlab issues).
These redirects are something that we will need to decide if we are going to migrate or not. So far, nothing was written for it.
Right now, when migrating comments to notes, we add as part of the description of the new note
Migrated from [comment #%s (#%s)](%s)
which has the internal comment order and the CID.
- 🇺🇸United States dww
@moshe re:
The redirect controller could forward unknown issue ids to the gitlab url, without any mapping. That way there is one url to check.
All kinds of flaws with that proposal:
- That only works if we assume that Drupal Core is the only project for which this matters. The GL URLs need to include a project name. "12345" could be from any project.
- If we don't preserve the IDs, "12345" could be both a valid d.o legacy issue ID, and a rolled over new GitLab ID. So d.o/i/12345 will redirect me to the GL issue for that old NID, but I still might end up on some irrelevant issue if I really needed to be at GL/$project/issues/12345 instead. The only ways for me to know are to compare dates, or re-train every Drupal contributor on commit message conventions (which we should do for other reasons, but that's out of scope here 😅).
- It sounds like there isn't a redirect controller, just a bunch of redirect content being generated, so there's not (currently) a way to implement your idea, even if it could work.
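The collision in the second point can be illustrated with a toy model. This is only a sketch with made-up data: the function name, the dictionaries, and the IID 87 are hypothetical, and the URL pattern is the git.drupalcode.org/project/PROJECT/issues/IID form mentioned earlier in the thread.

```python
BASE = "https://git.drupalcode.org/project"


def candidate_urls(project, number, legacy_map, native_iids):
    """Return every issue URL that a bare "#number" in a commit of `project` could mean."""
    urls = []
    # Reading 1: the number is a legacy d.o NID, resolved through the redirect.
    if number in legacy_map:
        legacy_project, iid = legacy_map[number]
        urls.append(f"{BASE}/{legacy_project}/issues/{iid}")
    # Reading 2: the number is an IID that exists in this GitLab project.
    if number in native_iids.get(project, set()):
        url = f"{BASE}/{project}/issues/{number}"
        if url not in urls:
            urls.append(url)
    return urls


# IDs not preserved: NID 12345 was renumbered to IID 87, and the project
# later reached IID 12345 on its own, so the reference is ambiguous.
renumbered = candidate_urls("token", 12345, {12345: ("token", 87)}, {"token": {12345}})

# IDs preserved: both readings point at the same issue.
preserved = candidate_urls("token", 12345, {12345: ("token", 12345)}, {"token": {12345}})

print(len(renumbered), len(preserved))
```

With renumbered IDs the function returns two different issues for the same reference; with preserved IDs it returns one.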
Meanwhile, @poker10's concern about the longevity of all that redirect content is a great one. Makes me think that instead of just generating redirects directly, we should have a redirect controller, and a {drupalorg_gitlab_issue_map} table (or whatever) with all the legacy d.o NIDs -> project + GL IID values. Then no one can break the redirects via the UI. And we'll have a canonical remap table to use going forward in case we need it for other things. I can't imagine the cost of having such a table in our DB is too great for all the potential benefits that would come from doing it that way. It'd probably be a more efficient DB-storage than storing all the redirects separately, in fact.
If there's no way to work around GL's limitation that we can't programmatically set both IID + Author to what we want, how about we create a new field called "Original Author" or something? So the formal Author on migrated GL issues would be the "admin" user, but we can at least know the regular d.o user that created the legacy issue? Going forward, "Original Author" wouldn't be set on new GL issues, or we automatically set it to the "Author", or whatever. Sort of a PITA, but IMHO less painful than having issue IDs colliding.
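A rough sketch of the map-table idea, using an in-memory SQLite database so it runs anywhere. Only the table name comes from this comment; the column names, the lookup function, and the sample row (including IID 42) are assumptions for illustration, and a real implementation would live in Drupal rather than Python.

```python
import sqlite3

GITLAB_BASE = "https://git.drupalcode.org/project"  # URL pattern mentioned earlier in the thread

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE drupalorg_gitlab_issue_map (
        nid INTEGER PRIMARY KEY,   -- legacy drupal.org node ID
        project TEXT NOT NULL,     -- GitLab project short name
        iid INTEGER NOT NULL       -- GitLab per-project issue ID
    )
    """
)
# Hypothetical sample row: the IID value is made up.
conn.execute(
    "INSERT INTO drupalorg_gitlab_issue_map VALUES (?, ?, ?)",
    (3295357, "drupalorg", 42),
)


def resolve_legacy_issue(nid):
    """Return the GitLab URL for a legacy NID, or None if it is not in the map."""
    row = conn.execute(
        "SELECT project, iid FROM drupalorg_gitlab_issue_map WHERE nid = ?",
        (nid,),
    ).fetchone()
    if row is None:
        return None
    project, iid = row
    return f"{GITLAB_BASE}/{project}/issues/{iid}"


print(resolve_legacy_issue(3295357))
```

A redirect controller would do little more than this lookup; the same rows could also be used to regenerate redirect entities if those stay the primary mechanism.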
- 🇺🇸United States drumm NY, US
Makes me think that instead of just generating redirects directly, we should have a redirect controller, and a {drupalorg_gitlab_issue_map} table (or whatever) with all the legacy d.o NIDs -> project + GL IID values. Then no one can break the redirects via the UI. And we'll have a canonical remap table to use going forward in case we need it for other things.
In my experience, the risk of a redirect controller breaking in code updates is higher. Unless we get better at test-driven-development for Drupal.org, it will break at some point and go unnoticed. Using the common redirect module ensures we have less code to maintain and upgrade. And I don’t expect people to be editing these redirects in the UI.
- 🇺🇸United States dww
And I don’t expect people to be editing these redirects in the UI.
It's not about what we expect people to do, it's about preventing the possibility of changing these, either by accident or malice. This is about data integrity.
It's like the map tables from Drupal migrations. Even if you don't intend to write any code to consume it, the cost of generating that table as part of this migration is almost nil. The possibility that it will really come in handy exists.
If, 3 years from now, we want to check if all the redirects still exist, the table would let us. If we decide we prefer a custom controller for some reason(s), we could.
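As a sketch of that kind of check, assuming the map is available as NID -> (project, IID) and the live redirects as source path -> target URL. The function name, the data shapes, and the sample values are hypothetical; the /i/NID source path and the target URL pattern are the ones mentioned earlier in the thread.

```python
def audit_redirects(issue_map, redirects):
    """Compare the canonical map against live redirects; return (missing, altered) NIDs."""
    missing, altered = [], []
    for nid, (project, iid) in sorted(issue_map.items()):
        expected = f"https://git.drupalcode.org/project/{project}/issues/{iid}"
        actual = redirects.get(f"/i/{nid}")
        if actual is None:
            missing.append(nid)    # redirect was deleted
        elif actual != expected:
            altered.append(nid)    # redirect now points somewhere else
    return missing, altered


# Made-up data: one redirect intact, one deleted, one edited.
issue_map = {100: ("token", 100), 200: ("token", 200), 300: ("views", 300)}
redirects = {
    "/i/100": "https://git.drupalcode.org/project/token/issues/100",
    "/i/300": "https://git.drupalcode.org/project/views/issues/999",
}

missing, altered = audit_redirects(issue_map, redirects)
print(missing, altered)
```

Without the map there is nothing to compare the redirects against, which is the data-integrity point being made here.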
- 🇪🇸Spain fjgarlin
I have a task in my backlog to adapt the IID of the migrated issue to that of the NID. From what we've read (see links in #16), it is possible, and I will try to achieve it. I just haven't coded it yet.
- 🇸🇰Slovakia poker10
I have mentioned this on Slack, but will post it here too.
Not sure what the exact status of the MR is (I have not reviewed it), but from what I remember, there were some open concerns about keeping the issue IDs, how issue metadata should be migrated ( 🌱 Using GitLab labels for issues on Drupal projects Active ), and similar. @fjgarlin when you have time to work on this again, it would be great to update the current status (what is done, what decisions are needed and what work is needed).
Given that this will be a one-time migration (for each project), without an option to revert (unlike the GitLab CI migration), I think it would be great if we could see at least one "demo" project migrated first. We would be able to evaluate and test whether everything went correctly, whether all references to issues, comments, etc. are kept and working, how the meta info is migrated, etc. Some feedback could be collected this way before the real migrations via the opt-in process start. Thanks!
- 🇪🇸Spain fjgarlin
Re IID, there is some code that needs to be tested. I just added it to the MR as a comment for clarity.
We will have an opt-in for projects, and initially, we have mechanisms to revert the migration in case something goes really bad. But yeah, we will need to test with a handful of projects and then test references and a few other things before doing more projects.
Once I come back to work on this, I'll mention it here and we can see what the next steps are.