Configuration of user-specific Advanced URLs
FirstSpirit provides interfaces and a reference implementation (“Advanced URL Creator”) that support the integration of different path strategies for generating URLs. URL generation is delegated to a UrlFactory. In addition to the reference implementation, which improves URL generation (“Advanced URLs” setting in the project properties), these interfaces can be used to implement new customer-specific path strategies and integrate them into FirstSpirit as modules.
All classes which implement the UrlFactory interface support the passing of parameters and values for configuring URL generation. Class-internal fields for saving the configuration setting are defined to make this data – which is made available to the UrlFactory.init(…) method – persistent:
public class UrlCreatorExample implements UrlFactory {
// Fields for persistence during the lifetime of the object.
private PathLookup _pathLookup;
private boolean _useWelcomeFilenames;
public void init(final Map<String, String> settings, final PathLookup pathLookup) {
_pathLookup = pathLookup;
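// The settings map holds strings, so the value has to be converted explicitly.
_useWelcomeFilenames = Boolean.parseBoolean(settings.get("usewelcomefilenames"));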
....
}
...
}
The init(…) method is called every time a UrlFactory object is instantiated so that the object can be initialized with data that cannot change during a generation process. When the method is called, a map object containing the configuration parameters is made available along with a PathLookup object which can be used to query user-defined URLs on Store directories.
For an example application, see the reference implementation “Advanced URL Creator”.
Passing of configuration parameters
All configuration parameters must be defined either using a schedule script that is executed BEFORE the actual generation schedule or with the configuration settings in the module.xml file.
Passing with script:
Passing with predefined script in schedule management (in the project properties), e.g.,
context.setProperty("#urlCreatorSettings", Collections.singletonMap("usewelcomefilenames", "true"));
If several parameters are to be passed, this is achieved by means of a map:
factorySettings = new HashMap();
factorySettings.put("useregistry","true");
factorySettings.put("useLowercase","true");
factorySettings.put("removedeleted","true");
context.setProperty("#urlCreatorSettings", factorySettings);
Further standard configuration parameters and other user-defined parameters can be passed in the same way. These parameters must subsequently be passed to the corresponding URL factory implementation (see above).
Passing with module.xml and / or module-isolated.xml:
Instead of being passed in an upstream script, the configuration parameters can be defined in the module.xml (legacy) or module-isolated.xml file within the <configuration> tags, e.g.:
<module>
<name>myConfiguredUrlCreator</name>
<version>0.1</version>
<description>my configured URL Creator</description>
<vendor>myCompanyName</vendor>
<components>
<public>
<name>ConfiguredAdvancedUrlCreator</name>
<class>de.espirit.firstspirit.generate.UrlCreatorSpecification</class>
<configuration>
<UrlFactory>de.espirit.firstspirit.generate.AdvancedUrlFactory</UrlFactory>
<useLowercase>true</useLowercase>
<useWelcomeFilenames>true</useWelcomeFilenames>
</configuration>
</public>
</components>
</module>
Standard configuration parameters
The FirstSpirit framework evaluates some standard parameters:
- useWelcomeFileNames
- stripWelcomeFileNames
- useIRIs
- useRegistry
- readOnlyRegistry
- removeDeleted
- useLowercase
- selfLink
(The parameters are not case sensitive, i.e., useIRIs, UseIris, or UseIRIs are all treated as the same parameter.)
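A custom UrlFactory that reads its own settings may want to mirror this case-insensitive behavior. The following is a minimal sketch, not FirstSpirit code: the class `UrlSettings` and its `flag` method are hypothetical, and whether the map passed to init(…) already has normalized keys is not stated here. Accepting yes/no for every flag is also an assumption; these spellings are only documented for useWelcomeFileNames and stripWelcomeFileNames.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the FirstSpirit API.
final class UrlSettings {
    // A TreeMap with a case-insensitive comparator treats "useIRIs" and "useiris" as one key.
    private final Map<String, String> values = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    UrlSettings(Map<String, String> raw) {
        values.putAll(raw);
    }

    // "true"/"yes" -> true, "false"/"no" -> false, unset or any other value -> defaultValue.
    boolean flag(String name, boolean defaultValue) {
        String value = values.get(name);
        if (value == null) {
            return defaultValue;
        }
        String v = value.trim();
        if (v.equalsIgnoreCase("true") || v.equalsIgnoreCase("yes")) {
            return true;
        }
        if (v.equalsIgnoreCase("false") || v.equalsIgnoreCase("no")) {
            return false;
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        UrlSettings settings = new UrlSettings(Collections.singletonMap("UseIRIs", "yes"));
        System.out.println(settings.flag("useiris", false));
        System.out.println(settings.flag("removeDeleted", false));
    }
}
```

The default is passed in by the caller because the documented defaults differ per parameter (for example, “true” for useRegistry and “false” for removeDeleted).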
useWelcomeFileNames
The “useWelcomeFileNames” parameter can be used to configure start page references.
context.setProperty("#urlCreatorSettings", Collections.singletonMap("usewelcomefilenames", "true"));
This parameter is used in the Advanced URL Creator reference implementation (see above).
Possible values:
- true or yes or value not set (default value):
Only the first HTML template set uses Welcomefilenames.
- false or no:
No Welcomefilenames are used.
- all:
All HTML template sets use Welcomefilenames.
This type of configuration can lead to identical URLs, see (*).
- Comma-separated list of template sets:
All listed channels use Welcomefilenames.
This type of configuration can lead to identical URLs, see (*).
true or yes or value not set (default value):
If the parameter is passed with the value true (default value), the file name index.* is provided for page references which are marked as the start page of a folder in the Site Store when an Advanced URL is generated (regardless of the display name or the file name from the properties dialog).
This applies to the first HTML template set (e.g., html).
In standard URL generation mode,
../de/startpage/firstspirit_hybrid_cms.html
../en/startpage/firstspirit_hybrid_cms.html
become the following in advanced mode with “useWelcomeFileNames”:
../Startseite/index.html
../Startpage/index.html
For all other template sets (e.g., php) and for page references that are not identified as a start page, the URLs continue to be generated based on the display name of the page reference (with the blank space being replaced by a “-” character).
false or no:
If the parameter is passed with the value false, regardless of whether a page reference is a start page or not, the URLs are generated based on the display name of the page reference (with the blank space being replaced by a “-” character):
../Startseite/FirstSpirit-Hybrid-CMS.html
../Startpage/FirstSpirit-Hybrid-CMS.html
all:
If the parameter is passed with the value all, the file name index.* is provided for page references which are marked as the start page of a folder in the Site Store when an Advanced URL is generated (regardless of the display name or the file name from the properties dialog).
This applies to all HTML template sets contained in the project.
This type of configuration can lead to identical URLs, see (*).
List of template sets:
If a list of template sets is passed to the parameter, the file name index.* is provided for page references which are marked as the start page of a folder in the Site Store when an Advanced URL is generated (regardless of the display name or the file name from the properties dialog).
This applies to all HTML template sets that are included in the list.
The name of the template set is given here (template set name: see Template sets (→Documentation for Administrators)).
This type of configuration can lead to identical URLs, see (*).
(*) With a configuration that uses “WelcomeFileNames” for all or multiple template sets, a folder can contain several index.* files (e.g., “/index.html” and “/index.php”). If the /index.* extensions are then also removed via “stripWelcomeFileNames”, this will result in identical URLs. It is strongly advised not to use this type of configuration.
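The four kinds of values described above can be summarized as a small decision function. This is a sketch under assumptions: the class and method names are hypothetical, the list of HTML template sets is supplied by the caller in project order, and names in a comma-separated list that do not match any template set are simply ignored (this page does not say how such names are treated).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the FirstSpirit API.
final class WelcomeFileNames {
    // Returns the template sets whose start pages would receive the file name index.*.
    static Set<String> templateSetsUsingIndex(String value, List<String> htmlTemplateSets) {
        Set<String> result = new LinkedHashSet<>();
        String v = value == null ? "true" : value.trim(); // value not set -> default "true"
        if (v.equalsIgnoreCase("false") || v.equalsIgnoreCase("no")) {
            return result; // no template set uses index.*
        }
        if (v.equalsIgnoreCase("true") || v.equalsIgnoreCase("yes")) {
            if (!htmlTemplateSets.isEmpty()) {
                result.add(htmlTemplateSets.get(0)); // only the first HTML template set
            }
            return result;
        }
        if (v.equalsIgnoreCase("all")) {
            result.addAll(htmlTemplateSets);
            return result;
        }
        // Otherwise the value is read as a comma-separated list of template set names.
        for (String name : v.split(",")) {
            String trimmed = name.trim();
            if (htmlTemplateSets.contains(trimmed)) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sets = Arrays.asList("html", "php");
        System.out.println(templateSetsUsingIndex(null, sets));
        System.out.println(templateSetsUsingIndex("all", sets));
    }
}
```

Whenever the returned set contains more than one entry, a folder can end up with several index.* files, which is the situation the note (*) warns about.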
stripWelcomeFileNames
The “stripWelcomeFileNames” parameter is only relevant if the URL path strategy used also uses the useWelcomeFileNames configuration parameter. The “stripWelcomeFileNames” parameter can be used to remove the “/index.*” extension added by “useWelcomeFileNames” from the Advanced URL (but not from the file name under which the file is stored in the file system).
factorySettings = new HashMap();
factorySettings.put("usewelcomefilenames", "true");
factorySettings.put("stripwelcomefilenames", "true");
context.setProperty("#urlCreatorSettings", factorySettings);
Possible values:
- true or yes or value not set (default value):
/index.* (usually “/index.html”) is shortened.
- false or no:
URL is not shortened.
- List of extensions:
All listed extensions (e.g., “/index.html” and “/index.php”) are shortened.
true or yes or value not set (default value):
If the “useWelcomeFileNames” and “stripWelcomeFileNames” parameters are passed with the value true (default value), the start page of the Startpage folder is created in the file system with the extension /index.*, but can (if the web server has been configured accordingly) be called via the “/Startpage/” URL.
In standard URL generation mode,
../de/startpage/firstspirit_hybrid_cms.html
../en/startpage/firstspirit_hybrid_cms.html
become the following (in the file system) in advanced mode with “useWelcomeFileNames” and “stripWelcomeFileNames”:
../Startseite/index.html
../Startpage/index.html
and the Advanced URL becomes:
../Startseite
../Startpage
false or no:
If the “stripWelcomeFileNames” parameter is passed with the value false, the extension /index.* is retained for all start page references both in the file system and in the Advanced URL.
List of extensions:
If a list of extensions is passed to the parameter, the extension /index.* is shortened for page references which are marked as the start page of a folder in the Site Store when an Advanced URL is generated and, with the right web server configuration, can be called via the relevant URL (in a similar way to the behavior with the value true).
The names of the extensions are given here (target file extension: see Template sets (→Documentation for Administrators)).
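As an illustration of the shortening described in this section, the sketch below removes a trailing “/index.<extension>” from a generated path. It is not the reference implementation: the class name is hypothetical, and treating true/yes as “any extension” is an assumption, since the text only says that /index.* (usually “/index.html”) is shortened in that case.

```java
// Hypothetical helper, not part of the FirstSpirit API.
final class WelcomeFileStripper {
    // path:  generated path, e.g. "../Startpage/index.html"
    // value: true/yes/unset, false/no, or a comma-separated list of extensions
    static String strip(String path, String value) {
        String v = value == null ? "true" : value.trim(); // value not set -> default "true"
        if (v.equalsIgnoreCase("false") || v.equalsIgnoreCase("no")) {
            return path; // URL is not shortened
        }
        int slash = path.lastIndexOf('/');
        String file = path.substring(slash + 1);
        if (!file.startsWith("index.")) {
            return path; // not a welcome file
        }
        String extension = file.substring("index.".length());
        boolean anyExtension = v.equalsIgnoreCase("true") || v.equalsIgnoreCase("yes");
        if (!anyExtension) {
            boolean listed = false;
            for (String candidate : v.split(",")) {
                if (candidate.trim().equalsIgnoreCase(extension)) {
                    listed = true;
                }
            }
            if (!listed) {
                return path; // extension is not in the configured list
            }
        }
        // Drop "/index.<extension>"; a bare "index.<extension>" becomes the empty path.
        return slash < 0 ? "" : path.substring(0, slash);
    }

    public static void main(String[] args) {
        System.out.println(strip("../Startpage/index.html", "true"));
        System.out.println(strip("../Startpage/index.html", "php"));
    }
}
```

Only the URL is shortened here; the file is still written as index.* in the file system, so the web server has to be configured to deliver it for the folder URL.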
useIRIs
Where object names are concerned, FirstSpirit draws a strict distinction between display names (not unique, can be edited in multiple languages, support Unicode) and reference names (unique within the namespace, restricted to letters and numbers, i.e., do not support Unicode). While display names are relevant for editorial work and can be changed at any time by the editor, reference names are usually only needed by the template developer or for the purpose of internal intervention in the system and cannot be changed (or can only be changed with great difficulty). This two-layer naming convention has proven its worth in practice, but does lead to reference names having to be used in some places. For example, according to the specification, URLs can only contain US ASCII characters, whereas non-ASCII characters can only be encoded with UTF-8.
Advanced URLs are therefore generated from the display names of the FirstSpirit objects, which may contain illegal characters (umlauts, blank spaces, or similar). Some illegal characters are replaced one-to-one when Advanced URLs are generated (regardless of configuration with “useIRIs”). As a general rule, leading and trailing white space is removed from URLs, and the following characters, as well as any other spaces, are replaced by a hyphen (-):
\ / , : ; * ? " < > | # @ = & + % $
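The replacement rule can be sketched in a few lines. The class below is hypothetical and only illustrates the rule as stated above (trim, then replace each listed character and each remaining space with a hyphen); whether the real implementation collapses runs of hyphens or applies further rules is not covered here.

```java
// Hypothetical helper, not part of the FirstSpirit API.
final class DisplayNameSanitizer {
    // The characters from the list above: \ / , : ; * ? " < > | # @ = & + % $
    private static final String REPLACED = "\\/,:;*?\"<>|#@=&+%$";

    static String sanitize(String displayName) {
        String trimmed = displayName.trim(); // remove leading and trailing white space
        StringBuilder out = new StringBuilder(trimmed.length());
        for (int i = 0; i < trimmed.length(); i++) {
            char c = trimmed.charAt(i);
            // Each space and each listed character is replaced one-to-one by '-'.
            out.append(c == ' ' || REPLACED.indexOf(c) >= 0 ? '-' : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("FirstSpirit Hybrid CMS"));
        System.out.println(sanitize("  About us "));
    }
}
```

With this sketch, the display name “FirstSpirit Hybrid CMS” becomes “FirstSpirit-Hybrid-CMS”, which matches the file name shown in the useWelcomeFileNames examples.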
The “useIRIs” parameter (default value “true”) is used to generate all URLs in UTF-8, inclusive of spaces and special characters.
context.setProperty("#urlCreatorSettings", Collections.singletonMap("useiris", "true"));
In standard URL generation mode,
../de/marketing/aboutus.html
../en/marketing/aboutus.html
become the following in advanced mode with “useIRIs”:
../Marketing/Über-uns.html
../Marketing/About-us.html
and the following in advanced mode without “useIRIs”:
../Marketing/%C3%9Cber-uns.html
../Marketing/About-us.html
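The difference between the two variants is plain UTF-8 percent-encoding, which the following sketch reproduces. The class is hypothetical and the set of characters left unencoded (ASCII letters, digits, and “- . _ ~ /”) is an assumption chosen for path-like input; the reference implementation may keep a different set.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the FirstSpirit API.
final class IriEncoder {
    // Characters kept as-is in addition to ASCII letters and digits (assumption).
    private static final String KEPT = "-._~/";

    static String toAsciiUrl(String iriPath) {
        StringBuilder out = new StringBuilder();
        // Encode byte-wise so that multi-byte UTF-8 characters become several %XX groups.
        for (byte b : iriPath.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            boolean keep = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9') || KEPT.indexOf(v) >= 0;
            if (keep) {
                out.append((char) v);
            } else {
                out.append('%').append(String.format("%02X", v));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00DC is the letter "Ü"; the escape avoids source-encoding issues.
        System.out.println(toAsciiUrl("../Marketing/\u00DCber-uns.html"));
    }
}
```

For the German example path, the sketch yields “../Marketing/%C3%9Cber-uns.html”, because “Ü” is encoded in UTF-8 as the two bytes C3 and 9C.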
The “useIRIs” parameter does not affect file names, only the Advanced URL generated (in the same way as the stripWelcomeFileNames parameter).
When triggered by $CMS_REF(...)$/ref(...), the entry of useIRIs=false does not affect characters defined via selfLink. For example: The configuration useIRIs=false selfLink="?" does not lead to <a href="%3F">, but remains <a href="?">.
useRegistry
The URL generation for a project can be manipulated via the “Global Settings” in FirstSpirit SiteArchitect. For example, "short URLs" for certain project content (e.g., for “landing pages”) can be configured here. This makes it possible to generate short, “meaningful” URLs that are easily remembered for certain content, in addition to the normal URLs. These concepts (short URLs, SEO URLs) are only effective if the Advanced URLs are saved in a persistence structure in which each object is assigned a URL that is unique across the entire project.
If the “useRegistry” parameter is passed with the value “true” (default value), all new Advanced URLs generated are saved in the project registry. In this case, the underlying URL Creator implementation can read the URLs from the persistence structure and/or save new URLs in the persistence structure.
context.setProperty("#urlCreatorSettings", Collections.singletonMap("useregistry", "true"));
The registry can grow significantly during the lifetime of a project, since URLs are retained once they have been saved. If, for example, an object is deleted from within a project, the URL saved for this object is retained so that the URL can be restored automatically if the object is restored (see also Saving and resetting URLs for more information). The deletion of saved Advanced URLs can be configured with the “removeDeleted” parameter.
removeDeleted
The “removeDeleted” parameter is only relevant if the URL path strategy used also uses the “useRegistry” configuration parameter.
factorySettings = new HashMap();
factorySettings.put("useregistry", "true");
factorySettings.put("removedeleted", "true");
context.setProperty("#urlCreatorSettings", factorySettings);
If the “removeDeleted” parameter is passed with the value “true” (default value “false”), URLs on objects which have already been deleted can also be removed from the registry. The standard procedure is that the Advanced URLs are retained in the project registry so that if a deleted object is restored in the project, the URL generated for this object can also be restored automatically.
In worst-case scenarios, the registry can grow significantly during the lifetime of a project. If, for example, a page reference in a project is deleted which contains a PageGroup relating to a content projection, many thousands of saved URLs can quickly be associated with this page reference (even via deleted datasets).
Another application is the resolving of URL conflicts in a project. The use of the “useWelcomeFileNames” parameter can lead to problems, for example. Since URLs are assigned once and are unique within a project, once an “index” entry has been generated for a start page, it is retained, even if the associated page reference has already been removed from the project. New start pages are then automatically assigned a consecutive number instead of
../Startpage/index.html
and therefore become:
../Startpage/index.1.html
To resolve this conflict, the old index entry must first be removed from the registry.
Removing an Advanced URL from the persistence structure can potentially lead to bookmarks and external links to this URL losing their validity.
readOnlyRegistry
This parameter enables the read-only use of URLs stored in the registry:
If it is passed with the value true (default value: false), existing URLs will continue to be read from the registry, but new URLs will not be added to it.
The parameter can thus be used for generations that should not generate permanent URLs (e.g. generations of the current state on a test server) and thus to check the generated URLs in advance.
In order for the readOnlyRegistry to be evaluated, the useRegistry parameter must be set to true.
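How the three registry parameters interact can be pictured with a small in-memory model. This is purely illustrative: the class `UrlRegistrySketch`, its methods, and the string object IDs are hypothetical, and the real project registry is a persistent structure whose API is not described on this page.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model, not the FirstSpirit project registry.
final class UrlRegistrySketch {
    private final Map<String, String> urlByObjectId;
    private final boolean readOnly;

    UrlRegistrySketch(Map<String, String> storedUrls, boolean readOnly) {
        this.urlByObjectId = new HashMap<>(storedUrls); // copy, so the input is not modified
        this.readOnly = readOnly;
    }

    // useRegistry: a stored URL wins; a new URL is stored unless readOnlyRegistry is set.
    String resolve(String objectId, String candidateUrl) {
        String stored = urlByObjectId.get(objectId);
        if (stored != null) {
            return stored;
        }
        if (!readOnly) {
            urlByObjectId.put(objectId, candidateUrl);
        }
        return candidateUrl;
    }

    // removeDeleted: drop the entries of objects that no longer exist in the project.
    void removeDeleted(Set<String> existingObjectIds) {
        urlByObjectId.keySet().retainAll(existingObjectIds);
    }

    int size() {
        return urlByObjectId.size();
    }

    public static void main(String[] args) {
        Map<String, String> stored = Collections.singletonMap("page-1", "/Startpage/index.html");
        UrlRegistrySketch testRun = new UrlRegistrySketch(stored, true);
        System.out.println(testRun.resolve("page-2", "/Marketing/About-us.html"));
        System.out.println(testRun.size());
    }
}
```

In read-only mode the generation still sees every URL that was stored earlier, but the registry keeps its size, which is why such a run is suitable for checking URLs on a test server.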
useLowercase
In Windows, it does not matter whether upper or lower case is used for files and folders, i.e., the folders “abc”, “Abc”, and “ABC” are the same folder.
By contrast, Unix file systems do differentiate between file and folder names (that are otherwise identical) on the basis of case. As a result, conflicts may occur if the files are – for example – created under Linux and then transferred to a Windows system.
You can use the “useLowercase” parameter to control the case that is used for files and folders during generation:
If the parameter is passed with the value set to “false”, the case is taken into account in the project when Advanced URLs are generated. This is the default behavior.
If the parameter is passed with the value set to “true”, lower case is used when generating the file and folder names.
Example:
A project contains two structure folders (menu levels) with the display names “Folder” and “folder”.
If generation is performed with the value of “useLowercase” set to “false”, the following Advanced URLs are created:
/folder/index.html
/Folder/index.html
If generation is performed with the value of “useLowercase” set to “true”, the following Advanced URLs are created:
/folder/index.html
/folder/index-2.html
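The collision in this example can be detected before generation by grouping display names by the folder name they would produce. The sketch below uses a hypothetical class and does not model how FirstSpirit numbers the conflicting files; it only shows which names end up in the same folder.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the FirstSpirit API.
final class LowercaseCollisions {
    // Maps each resulting folder name to the display names that produce it.
    static Map<String, List<String>> groupByFolderName(List<String> displayNames,
                                                       boolean useLowercase) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String name : displayNames) {
            // Locale.ROOT keeps the result independent of the server's default locale.
            String folder = useLowercase ? name.toLowerCase(Locale.ROOT) : name;
            groups.computeIfAbsent(folder, key -> new ArrayList<>()).add(name);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Folder", "folder");
        System.out.println(groupByFolderName(names, false));
        System.out.println(groupByFolderName(names, true));
    }
}
```

Any group with more than one display name marks a folder in which the generated files have to be disambiguated, as in the second pair of URLs above.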
selfLink
The following always applies, regardless of whether the reference implementation (“Advanced URL Creator”) is used or a self-created customer-specific path strategy is implemented on the basis of the new interface and integrated into FirstSpirit as a module: In the case of self-referencing (i.e., when the page currently being generated contains a reference to itself, e.g., within the navigation), the name part is returned as the URL by default, e.g., "index.html".
The selfLink configuration parameter can be used to configure a constant string, which is returned in the case of self-referencing.
The following values are allowed:
- "" (empty string): If this value is set, an empty string is returned in the case of self-referencing.
- Other permitted values are "?" and "#", which (depending on the browser) are likewise interpreted as references to the current page.
Note: Not all browsers can handle “empty” links (""). Cases have come to light in which the browser fails to interpret an empty link as a link to the currently displayed URL and instead treats it as a link to the directory path where the generated file is stored. Instead of reloading from "/index.html", the browser attempts to open the current directory "/". In such cases, the relevant pages may not be reloaded if a reference is output in the HTML template set via the $CMS_REF(..)$ expression in an <a href> tag. This is particularly problematic as far as dynamic web pages are concerned. In conjunction with the parameter setting useWelcomeFilenames=false (see above), this behavior will ultimately result in an HTTP 404 error, because no index file can be found.
In FirstSpirit versions < 5.2, an empty string "" was returned as the URL in the case of self-referencing.
For information on use with useIRIs, see above.
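The self-referencing rule can be sketched as follows. The class and method are hypothetical, URLs are compared as plain strings, and the non-self case simply returns the target URL unchanged, since relative path calculation is outside the scope of this sketch.

```java
// Hypothetical helper, not part of the FirstSpirit API.
final class SelfLinkSketch {
    // selfLink: null if the parameter is not configured, otherwise "", "?" or "#".
    static String href(String currentUrl, String targetUrl, String selfLink) {
        if (!currentUrl.equals(targetUrl)) {
            return targetUrl; // not a self-reference
        }
        if (selfLink != null) {
            return selfLink; // configured constant string
        }
        // Default: only the name part of the URL, e.g. "index.html".
        return targetUrl.substring(targetUrl.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        System.out.println(href("/Startpage/index.html", "/Startpage/index.html", null));
        System.out.println(href("/Startpage/index.html", "/Startpage/index.html", "#"));
    }
}
```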