Prevent creation of pages with "/" and "\" in the name

Hi!
By default, the Tomcat configuration doesn’t allow the characters "/" and "\" in the page name, which leads to inaccessible pages in XWiki.
A "Page not found!" message is displayed and the page can’t be renamed without a script.
This issue comes up very often, and the vast majority of use cases don’t require "/" or "\" in page names (as opposed to page titles). The current behavior does more harm than good.
I propose changing the current behavior so that pages with these characters in the name can no longer be created.

Do you have any opinions on this?

Thanks.

+1

I don’t like it (because it artificially restricts XWiki’s capabilities), but so many people experience the issue that I think we should do it (i.e. have a default config that works OOB with default Tomcat). And I’m pretty sure that the vast majority of use cases don’t require "/" or "\" in page names, so it would be OK.

More generally, we should let users specify in the Admin UI which characters are stripped out (and/or offer several cleanup options), and have "/" and "\" excluded by default.

We should also offer an API/Extension Point so that a developer could contribute some custom logic to generate good page names (using AI, custom cleanup, etc).

Coming back to this topic: the discussion continued a bit on the JIRA issue (https://jira.xwiki.org/browse/XWIKI-16861).
I’m +1 on the general idea, but I have questions about the details. The main one is: do we want it to be prevented on the backend side or only on the frontend side?

@tmortagne already said on JIRA that it should be done only on frontend, WDYT?

Regarding the kind of options we should provide for this feature, @vmassol also elaborated a bit on JIRA, saying:

I would go further than this. We need to offer extension points so that admins can provide different cleaning strategies for page names (we’ve had the need expressed several times) - https://forum.xwiki.org/t/prevent-creation-of-pages-with-and-in-the-name/5852/3 . We would offer just one basic implementation based on removing forbidden characters (configurable in the Admin UI and/or XWiki configuration files). What’s important is that it can be replaced by customizations. Actually it’s not just an OR but can also be an AND. IMO the best is to list in the Admin UI all existing filters applied to page names, with the ability to enable/disable them.
Example of strategies other than stripping forbidden characters:

  • Replace all spaces by dashes ("-").
  • Shorten page name to a max size
  • Transform into CamelCase syntax

WDYT?

Frankly, I’m not sure how to allow such an extension point at the frontend level with the current architecture, especially since it appears we don’t handle names in the same standard way everywhere (e.g. AWM names or user names).

Now I guess we could change the architecture to rely more on the server for all names: basically, have a component to which we submit a name and which returns a filtered name. It wouldn’t be difficult to do, and it would easily allow different strategies, but I’m a bit afraid of the performance penalty it would cause.

I think there are 2 extension points needed:

  1. On the server side: make page creation go through a list of validation components. This is easy to do and we already have the component architecture in place for it. All that would be needed is for CreateAction to get the list of enabled component hints and call them one after another (possibly in a defined order).
  2. On the client side: validation in JS to prevent sending the page creation request and to give users a nice early warning.

Note that 1) is the most important. 2) is the icing on the cake, and in any case the page creation endpoints (whether the UI or the REST API) should handle errors and report them nicely to the user.

If we want to do 2), it would indeed mean duplicating the filtering logic in JS and finding a way to introduce extensibility in JS (maybe we can do that already, I don’t know).
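A rough sketch of what point 1) could look like, in plain Java rather than with the real XWiki component API. Everything here is hypothetical (the `Validator` class, the hints, the `order` field, the 255-character limit); it only illustrates calling the enabled validators one after another in a defined order and reporting which one rejected the name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

public class PageNameValidation {
    /** Hypothetical validator: a hint, a position in the chain, and a rule. */
    public static final class Validator {
        final String hint;
        final int order;
        final Predicate<String> rule;

        public Validator(String hint, int order, Predicate<String> rule) {
            this.hint = hint;
            this.order = order;
            this.rule = rule;
        }
    }

    /** Runs validators by ascending order; returns the hint of the first one that rejects the name. */
    public static Optional<String> firstFailure(List<Validator> enabled, String name) {
        List<Validator> sorted = new ArrayList<>(enabled);
        sorted.sort(Comparator.comparingInt(v -> v.order));
        for (Validator v : sorted) {
            if (!v.rule.test(name)) {
                return Optional.of(v.hint);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Declared out of order on purpose: the chain sorts by the order field.
        List<Validator> enabled = List.of(
            new Validator("maxLength", 20, n -> n.length() <= 255),
            new Validator("noSlashes", 10, n -> !n.contains("/") && !n.contains("\\")));
        System.out.println(firstFailure(enabled, "Main"));       // prints Optional.empty
        System.out.println(firstFailure(enabled, "Sales/2020")); // prints Optional[noSlashes]
    }
}
```

Returning the hint of the failing validator gives the creation endpoint something specific to report back to the user.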

So we have several solutions:
A - Only implement 1)
B - Implement 1) and 2) (with logic duplication)
C - Introduce a back end service for filtering page name and call it from the page creation REST endpoint and from the UI (JS). I think this is what you were suggesting too @surli in your last paragraph above.

I think my preference goes to C.

WDYT?

Looks like you missed the main question:

I have questions regarding the details. The main one being: do we want it to be prevented on backend side or only on frontend side?

but I can infer that you want it to be done on backend side.
So if we put it on the backend side, do we apply it only to CreateAction, or also to SaveAction, with a check that the page is new before filtering the name, so that pages already created can still be edited even if their name doesn’t comply with the new policies?

Yes indeed.

Yes

What is SaveAction doing compared to CreateAction?

We also need to decide if we want to:

  • validate page names (ie refuse when the name isn’t valid)
  • filter page names (ie always accept but modify the page name by going

EDIT: See discussion at https://matrix.to/#/!ikPtGZaGWtyblizzlR:matrix.xwiki.com/$15780472522479RcYoK:matrix.xwiki.com?via=matrix.xwiki.com&via=matrix.org&via=matrix.wina.be

After discussing the potential problems that a name transformation in the backend could cause (see the Matrix discussion linked above), the new proposal is the following:

Create a central component that allows different naming strategies, each strategy being defined at two levels:

  • a validation against a pattern, which can be used in the backend
  • a transformation that takes a name and transforms it so that it matches the pattern, which can be used in the frontend

The idea is to start using this component at the UI level to transform the names, and to start using it at the backend level, at least for CreateAction (and maybe SaveAction), to validate the names, but with this validation deactivated by default since it might be harmful. It could be activated through the admin UI, with a warning that the feature is experimental.

The first strategies to implement would be forbidden characters (configurable, to choose which characters to forbid) and slug names (transform each non-ASCII character into an equivalent).
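Here is a minimal Java sketch of the two-level idea, under several assumptions. The `NamingStrategy` interface and both implementations are hypothetical names, not the actual component. The forbidden-characters check uses a simple character lookup instead of a regex pattern. The slug transformation only strips accents through Unicode NFD normalization; characters without an ASCII decomposition are simply replaced by dashes here, so a real implementation would need a proper transliteration step.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

public class NamingStrategies {
    /** Hypothetical contract: a check for the backend, a transformation for the UI. */
    public interface NamingStrategy {
        boolean isValid(String name);
        String transform(String name);
    }

    /** Rejects (or strips) any character from a configurable set. */
    public static final class ForbiddenCharacters implements NamingStrategy {
        private final String forbidden;

        public ForbiddenCharacters(String forbidden) {
            this.forbidden = forbidden;
        }

        @Override
        public boolean isValid(String name) {
            return name.chars().noneMatch(c -> forbidden.indexOf(c) >= 0);
        }

        @Override
        public String transform(String name) {
            StringBuilder kept = new StringBuilder();
            for (char c : name.toCharArray()) {
                if (forbidden.indexOf(c) < 0) {
                    kept.append(c);
                }
            }
            return kept.toString();
        }
    }

    /** Slug names: ASCII letters and digits separated by single dashes. */
    public static final class Slug implements NamingStrategy {
        private static final Pattern VALID = Pattern.compile("[A-Za-z0-9]+(-[A-Za-z0-9]+)*");

        @Override
        public boolean isValid(String name) {
            return VALID.matcher(name).matches();
        }

        @Override
        public String transform(String name) {
            // Decompose accented letters, then drop the combining marks.
            String ascii = Normalizer.normalize(name, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
            // Collapse every other run of characters into one dash and trim the ends.
            return ascii.replaceAll("[^A-Za-z0-9]+", "-").replaceAll("^-+|-+$", "");
        }
    }

    public static void main(String[] args) {
        NamingStrategy slug = new Slug();
        String name = slug.transform("Caf\u00e9 du Monde/2020");
        System.out.println(name);               // prints Cafe-du-Monde-2020
        System.out.println(slug.isValid(name)); // prints true
    }
}
```

In this sketch the UI would call `transform` while the user types, and the backend would only call `isValid`. Note that `Slug.transform` returns an empty string for a name with no ASCII letter or digit, which `isValid` rejects, so the caller still has to handle that case.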

WDYT?

+1. The UI should do live (delayed) server-side validation by calling this component.

What about the rename operation? It should be protected too.

Yep, on the UI side it will be protected too in the first iteration. For the backend, I guess we’ll see later.