Versioning

Process Versioning Background:

  • processes can be directly requested from an Engine and/or from a central storage repository
  • MS can be a desktop application: there are multiple MS on different PCs without same database
    • can be either online (processes can be synchronized with a central storage) or offline (no connection to a central storage)
  • MS can be a server application with one db and multiple users
  • multiple users can work on the same process at the same time collaboratively (only if online)
  • a User can be authenticated at multiple MS in parallel (I would like to avoid this, because this is not easy to achieve)

Process Versioning Requirements for PROCEED:

  • a MS/User must always be able to create/edit/store/deploy a new or available process version at any time

    • either only offline (not synced), or as well online (synced)
  • a MS/User must always be able to request/retrieve a stored process version directly from an Engine (not from other MS) and edit/redeploy it again

    • a User must be able to work offline on processes
  • the collaborative approach must work

  • a process version must be unique, have a name, date of the version creation, the creator of the version, a description, the date of the last change, the author of the last change

  • the name and description of the version must be changeable

  • a new process can be based on a specific version of some other process

  • the versions of a process must be sorted and indicate the relation between versions (tree structure)

  • the versions of a process must be comparable

  • a new process version should be created by hand

    • to add meta data about the version
    • to have not too much and only defined versions of a process (=> collaborative approach: no new version for every change)
  • user friendly: no branches and merges

    • merges probably make no sense for process diagrams

Not a requirement

  • a version should be locally and globally (if online) deleteable
  • every version can be reverted

Use Cases and Goals for Process Editing / Monitoring:

  1. one synced process version on MS1 is deployed to engine, other MS2 (online) gets the version
    1. retrieved version is older than the synced version on MS2, because on MS1 the work was continued
      • Process Edit MS2: let the user select which version to edit
      • Process Monitoring MS2: use retrieved version
    2. retrieved version is the same as the synced version on MS2, because on MS1 the work was stopped
      • Process Edit MS2: let the user select which version to edit
      • Process Monitoring MS2: use retrieved version
  2. one synced process version on MS1 is deployed to engine, other MS2 (offline) gets the version
    1. retrieved version on MS2 is older than the synced version in the repo (which is not synced), because on MS1 the work was continued
    2. retrieved version on MS2 is the same as the synced version in the repo (which is not synced), because on MS1 the work was stopped
      • Process Edit (BOTH CASES) MS2: the user can edit the retrieved version
      • Process Monitoring (BOTH CASES) MS2: use the retrieved version
  3. one not synced process version on MS1 is deployed to engine, other MS2 (online) gets the version
    1. version different on MS2 than on the synced server
      • Process Edit MS2: let the user select which version to edit
      • Process Monitoring MS2: use retrieved version
    2. current version on MS2, because the version already exists on the server
      • Process Edit MS2: let the user select which version to edit
      • Process Monitoring MS2: use retrieved version
    3. new version on MS2, because the version does not already exists on the server
      • Process Edit MS2: edit the retrieved version
      • Process Monitoring MS2: use retrieved version
  4. one not synced process version on MS1 is deployed to engine, other MS2 (offline) gets the version
    1. new or version on MS2, because the version does not already exists
      • Process Edit MS2: edit the retrieved version
      • Process Monitoring MS2: use retrieved version
    2. version different on MS2 than the existing ones
      • Process Edit MS2: let the user select which version to edit
      • Process Monitoring MS2: use retrieved version

Nomenclature

Parent Artifact : The process description which is the BPMN XML file. It contains Sub-Artifacts.

Sub-Artifacts: : Some file/resource needed in the process description, e.g. the HTML file for user tasks, or scripts, constraints, attached documents, etc.

General questions:

What happens if a new process is created? : Answer:

  • new id, etc.
  • new elements for the process:
    • proceed:ownerId and proceed:ownerName
    • if the process is based on another process, delete all collaborators (actually delete the whole meta information inside the extension element for bpmn:process)
    • if the process is based on another process, add this info
  • sync with the server if online

How is a process automatically synchronized with the server? : Answer:

  • a process is only sent to and stored on the server by the user that created the process (i.e. current authenticated user == owner)
    • in case the MS retrieved a process from an Engine, which is not on the server yet and where the owner is not the current user, the MS does not send the process to the server for storage
  • for collaboration the owner can add other users (inside the extension element for bpmn:process <proceed:collaborators> <proceed:collaboratorUser> <proceed:collaboratorGroup>)

: Background:

  • a user can connect the MS to a server (+ authenticate), there he can:
    • store processes and reuse them on another MS (if authenticated)
    • participate in the collaborative editing of a process with other users
  • for user friendliness: the synchronization of every process change and every process version should be done in the background automatically without user interaction (if authenticated, all processes are stored automatically on the server)

What happens if an offline MS works on an existing process (changing the HEAD or even creates a new process version) that was synced before and therefore in the editable process list? : Answer:

  • if( collaborators != empty): there are other collaborators => create a complete new process based on the retrieved version)
  • if( user == owner && collaborators == empty): go on, work on the same process
    • BIG Problem: the same user can even be authenticated on multiple MS in parallel, e.g. Web Client and one MS. What happens if the MS goes offline and work is continued on both "MS"
      • this is actually the same problem with multiple collaborators
      • But it would be very nice for a user if he can also work offline on a process without creating a new process - how to solve this?
        1. if the MS goes online again, check if there was a change in the meantime. if not, sync like described. if yes, give the User the opportunity to choose: Rewrite the server (+create copy), Rewrite the MS (+create copy) => this needs an identifier to indicate the lastHead (not last process version but the last modificationId, because also the unversioned HEAD is stored on the server)
        2. handle process like it has multiple other collaborators: create new processes if offline

What happens if the user was online (synced process with the server), goes offline (not authenticated), created new processed or worked on some of its own processes, then goes online again: : Answer:

  • if( process owner == current user && process == new): send all new processes to the server
    • because of the authentication, the owner maybe has to change for every newly created process
  • if( current user == process collaborator): get processes form the server, override all processes and their version where the user is a collaborator (more efficient: compare if it is already stored)
  • if( process owner == current user && collaborators == empty && process == changed): TODO see problem before

What happens if a MS works on a retrieved process from an Engine (changing the HEAD or even creates a new process version)? : Answer:

  • if (Online MS && (user == owner || user == collaborator) ): the process should already be in the user's process list. So, he can work on it. - if (Online MS && (user != owner || user != collaborator) ): he can only create a complete new process based on the retrieved version
  • if (Offline MS && user == owner && collaborators == not empty): he can only create a complete new process based on the retrieved version
  • if (Offline MS && user == owner && collaborators == empty): no collaborators
    • the process should already be in the user's process list. So, he can work on it. (Reason: if you authenticate, you also sync all the processes with the server (or you are always offline) => the retrieved process should already be in the process list)
      • PROBLEM: TODO the same user can be authenticated on multiple Systems in parallel (Web and MS), see before
  • if (Offline MS && user != owner && user == collaborator): he can only create a complete new process based on the retrieved version
  • if (Offline MS && (user != owner || user != collaborator) ): he can only create a complete new process based on the retrieved version

: Background:

  • preferred would be if the ongoing work is put on the linear process history. but this is problematic because
    • multiple other persons can change the same process in parallel (collaboration approach or other offline MS) => no branches should be created and no merges
  • the MS eventually goes online again, or sends the new version to an Engine (which is then retrieved by another MS already having the process in another version)
    • you can only add the retrieved version as a new version on top of existing one, if: local process version timestamp > lastModificationDate (since the creation of the last version, nothing was changed), then it can be easily integrated
    • if local process version timestamp > lastModificationDate : the process was changed on both MS (or on the Server) => how to combine these different processes?
      • a merge is too complicated, there complete new process

How is a version created? : Answer: Manually, by hand, if there is some reason to save the current status of the process

What is stored if you create a new version?

  • id (unique = date of the version creation in ms)
  • name (short description, can be Autofilled)
  • comment (Optional, long description)
  • author
  • indirect: lastModificationDate, lastModificationDateBeforeOffline

Are there extra attributes important for process versioning? : Answer:

  • proceed:versionBasedOn
  • proceed:ownerId and proceed:ownerName for stating the process owner (usually the creator, but the owner rights can maybe given to another user/role)

Do we need a complete version history on all MS? : Answer: No, it is okay, if one version in the middle of the history is maybe missing : Background:

  • no really good reason for doing that (just fo seeing and tracking the versions, is no killer app)
  • it is very problematic too ensure the correct version history on offline MS
  • it can be a big overhead, if one previous version can be deleted

How can versions of a process be sorted? : Answer: the version contains a timestamp, so it can be sorted after the time (Problem: time synchronization (which can probably not be solved for offline MS))

  • 2. Solution: the version references the previous process version, e.g. with proceed:processIdBasedOn and proceed:versionIdBasedOn
    • Problem: 1. version deletion (see later, can also not be solved for offline MS), 2. to ensure a complete version history in the offline case you need to store the complete history inside the process, because an offline MS does not have every version

What process artifacts need to be versioned? (BPMN, HTML Views, Scripts, Constraints, Machine Assignment, attached (binary) documents like a Specification) : Answer:

  • All textual, process artifacts (BPMN, HTML, Scripts, etc.): external (e.g. HTML) and internal (e.g. Scripts, Constraints) artifacts belong to a version
  • No binary artifacts and no attached documents are versioned

A new version for every new static assignment? : Answer: yes, because a Machine Assignment is actually something like an assignment to a role/participant (this is part of the process editing and should be visible for everyone and stored inside a version)

Do Sub-Artifacts (Scripts, HTML, Constraints, etc.) get (separate) version numbers? : Answer:

  • No, only the BPMN process gets a version number. Artifacts are linked (e.g. as files).
  • But: all linked artifacts get stored with a version. So, for every version there is a copy of all referenced artifacts.

Because we send multiple artifacts to an engine: can it happen that we just update (the version) of only one artifact on the Engine without all other? : Answer: No, a deployment always sends all process artifacts of one version.

Is there a new version created for every deployment? : Answer: it depends: only defined versions get deployed

  • => if the latest process change already has a version, then no new version needs to be created for deployment
  • => if you want to deploy a previous, then no new version needs to be created
  • => if latest changes are not stored as a version, then (automatically) create a new version at deployment

: Background: why are only defined versions deployed?

  • there is no new version created continuously for every change (because of the collaboration approach)
  • if you deploy a process, you usually want to deploy the latest changes
  • if deployed to an Engine and the process is modified during running on the Engine, there needs to be an indicator that the deployed version and the modified version are not the same anymore

How to find out if the version to be deployed is already the newest version or an unversioned change? : Answer:

  • Solution 2: there is a last modification date and author stored within the BPMN (proceed:lastModificationDate + proceed:lastModificationAuthor
    • Problem: time synchronization
  • Solution 1: there is a unique random number (proceed:modificationId) stored within the BPMN
    • this could be the event id, if there is one and if it is unique
    • Problem: no info about the time and author of the last change
  • Both solutions require the info to be set with the change event (not as a second event) - possible? (Reason: for synchronizing we only should send one change event to all other clients, not multiple -- or else maybe leads to problems)
  • this last modification is always the local "HEAD"
  • before deployment it is checked if the HEAD last modification == last version last modification

You can see previous versions of a process. Can you change such a version and what happens to newer versions? : Answer: It is not directly possible to change a specific version of a process. If you want to do it, it always creates a new process (id/name/etc.) based on the specified version

: Background: if you can change a previous version (i.e. not changing the version directly, but creating a new version/HEAD) and the process id/name/meta data stays the same, how to display this branch in the process list? - Solution 1: delete all newer versions => then there is always only one branch of process, i.e. one process with a linear history (Not wanted) - Solution 2: Starting editing a previous version means to create a completely new process (id/name/meta data), which can be displayed as a separate process

What happens if you create a new process based on another process version? : Answer: New id/name/meta data plus two extra attributes proceed:processIdBasedOn and proceed:versionIdBasedOn indicating the base and referring the original/previous version

Can you delete a specific version of a process? : Answer: no : Background:

  • you or just an Admin could do it: the timestamp in the version keeps the sorting correct. This deleting is transferred to the sync server, so it gets deleted on all clients.
  • Problem 1: what happens if an offline MS gets online again and has a deleted or new version of a process that it wants to synchronize?
    • Solution: the version should stay deleted. therefore the server needs to keep a cache/blacklist with all already deleted versions of a process. if a MS syncs a process list, it would then delete the old version
  • Problem 2: other processes can be based on another version of a process -> it is maybe not available anymore

Can you revert to a specific version of a process, i.e. delete all newer version? : Answer: no, not allowed (see version deletion)

Can you delete a complete process with all versions? : Answer: yes, because this is often a business requirement (maybe as a specific right in the user management)

: Problems:

  • What happens with processes deployed on offline MS and Engines?
    • Solution: Keep a list of deleted processes and delete the process if the MS goes online again
    • Problem: What happens if the offline MS already created a new version based on the deleted process?
      • Solution: Create a new process (see "MS offline/online syn problem")

How to start a process on the Engine? : Answer: always start a specific version of the process

How do the other tools do that? (BPMN Method and Style, Camunda, Signavio, ...) : Answer:

  • Camunda creates a new version when a changed process definition is redeployed, running process instances will continue to run in the version they were started in, new process instances will run in the new version - unless specified explicitly. For more info See here (opens in a new tab)
  • BPMN and Signavio have no attribute/element for process versioning

scenarios

  • MS goes from online to offline to online

Use Case Two MS synced

sequenceDiagram participant U1 as User1 participant M1 as MS1 participant E as Engine participant M2 as MS2 participant U2 as User2 U1->>M1: edit/create synced Process note over M1: Process has unique ID

loop U1->>M1: Update process opt U1->>M1: Update Machine Assignment end M1->>M1: sync to server end

opt User clicks "New Version" Button U1->>M1: Create new version M1->>M1: Write Version-Name M1->>M1: Generate Version-ID M1->>M1: sync to server end

alt Dynamic Deployment opt U1->>M1: Select Process Version note over U1,M1: Default: the Named Version that equals HEAD || HEAD end U1->>M1: dynamic deploy selected process version opt Selected Version == HEAD (No-Named Version) M1->>M1: automatically create new Version M1->>M1: sync to server end M1->>E: deploy process

else Static Deployment opt U1->>M1: Select Process Version note over U1,M1: Default: the Named Version that equals HEAD || HEAD end opt Button: Change Machine Assignment U1->>M1: create Machine Assignment opt U1->>M1: create new Version end M1->>M1: sync to server end U1->>M1: static deploy selected process version opt Missing Machine Assignment U1->>M1: create Machine Assignment end opt Selected Version == HEAD (No-Named Version) M1->>M1: automatically create new Version M1->>M1: sync to server end M1->>E: deploy process end

M2->>E: request process E-->>M2: send process artifacts M2->>M2: search if process exists with the id and/or if it is new version alt new process id || new version M2->>M2: sync to server else not new M2->>M2: discard? end

loop U2->>M2: Update process opt U2->>M2: Update Machine Assignment end M2->>M2: sync to server end note over U2,M2: same as for MS1 and User1...

Example Using existing process as template to create a new process

sequenceDiagram User1->>MS1: Save initial BPMN note over MS1: create unique process id note over MS1: create new version loop update BPMN User1->>MS1: save changes to BPMN note over MS1: create new version end User1->>MS1: deploy process to engine opt if machine assignments change (static deployment) note over MS1: create new version end MS1->>Engine: deploy process MS2->>Engine: request process Engine->>MS2: send process artifacts

User2->>MS2: import as new process note over MS2: create unique process id note over MS2: create new version note over MS2: copy all sub-artifacts under the new process loop update BPMN User2->>MS2: save changes to BPMN note over MS2: create new version end loop update BPMN User1->>MS1: save changes to BPMN note over MS1: create new version end opt if machine assignments change (static deployment) note over MS2: create new version end User2->>MS2: deploy process to engine MS2->>Engine: deploy new process version MS1->>Engine: request new process version Engine->>MS1: send process information note over MS1: import new process

Implementation:

  • How and where to describe the versions:

    • how does the version number look like? (date, v1, hash?)

      • Answer: date in ms since 1970 (plus, we also get the date of the last change)
    • where should the version and the references to a versioned artifact be placed in BPMN? (e.g. in process-id or in new version-attribute?)

      • Answer:
      • BPMN file: file name and inside the BPMN XML with proceed:version
      • Sub-Artifacts (Internal): inside the BPMN XML with proceed:version
      • Sub-Artifacts (External): have file/resource name which is references inside the BPMN XML with proceed:version
    • how to version artifacts, that don't have BPMN XML structure? (e.g. HTML) (maybe just via a new URL endpoint, file name, or maybe via some meta HTML tag)

      • Answer: with a new (file/resource) name for every version of an artifact
    • how does santhos BPMN engine handle versions?

      • problem: starting a part of a process must always start it in the correct version. example: a process is started at engine 1 and continued at engine 2 -- it must start the same version
        • this could be purely handled by the PROCEED surrounding
      • Answer: Santhos engine probably doesn't need to handle it
  • how are the versioned artifacts stored on the PROCEED engine and on the MS? (Git?)

    • Answer: the process artifacts are stored in the same way on the MS and on the Engine like this:
      • process-id/ (folder)
        • process-id-version (BPMN)
        • process-id-version (BPMN)
        • user-tasks/ (folder)
          • html-id-version (HTML)
          • html-id-version (HTML)
    • a process version internally references sub-artifacts like the HTML files for user tasks
    • the name of the file/resource contains the version -- it can also be inside the file
    • we don't use git for versioning, because it bring to much complexity
  • Deployment: what should be deployed to an Engine at process deployment?

    • Answer: all artifacts (BPMN and sub-artifacts), even if they didn't change
  • Engine: Deployment and Storage: what does the Engine, if a process deployment comes in?

    • Answer:
      • checks if the process id exists
      • checks if the process version exists
      • checks if the sub-artifacts version combination exists
      • if it is a new process or process version, sends it to the Garbage Collector for updating its reference list
      • overrides the files/resources that already exist
      • depending on the configuration, triggers the Garbage Collector (GC) (must not execute immediately)
  • Storage: how to delete process models, versions and artifacts? (on the MS and on the Engine)

    • delete all versions of a process: simple, just delete the whole process folder/resource, because it contains every artifact and will delete all sub-resources
    • delete a specific version of a process: - Problem: a referenced sub-artifact (e.g. HTML) can only be deleted, if no other process references it anymore - delete the version of the BPMN file and trigger the Garbage Collector
    • delete a specific sub-artifact: should not be possible
  • Storage: what does the Garbage Collector (GC) do?

    • this depends on the Configuration
      • StoreLastFromOwner (Default on Engine): delete all other (older and newer), non-running versions of the process from the same owner
      • StoreLast: delete all other (older and newer), non-running versions of the process
      • StoreAll (Default on MS): keep every process version, just delete every unreferenced sub-artifact
    • the GC runs if regularly
    • the GC must go through all processes and create a reference list
      • the reference list is stored to not be recreated every time, if nothing changed

      • just checks if every file is still there

      • uses the reference list to delete unreferenced files