Race conditions

Dawid Niezgódka edited this page Feb 9, 2023 · 5 revisions

This page describes the attempts made to solve the problem of race conditions.

According to an analysis of the current codebase, race conditions can occur in numerous scenarios, in particular:

  1. Two users simultaneously update the tags of the same spec item. One of them commits first and persists their changes to the database. The second commits later and overwrites the changes made by the first. Thus, a silent lost update is possible.
  2. One user wants to update a spec item, for example by adding some tags, while another wants to delete it (further investigation is needed here).
  3. Multiple users add the same documents that contain identical spec items.
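
The first scenario can be replayed deterministically without a database. The following is a minimal sketch in plain Java; `NaiveTagStore` and `LostUpdateDemo` are hypothetical names used only for illustration and do not exist in the project. Both users read the same state, and the later write silently discards the earlier one.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the tag table: short name -> comma-separated tags. */
class NaiveTagStore {
    private final Map<String, String> tagsByShortName = new HashMap<>();

    String read(String shortName) {
        return tagsByShortName.getOrDefault(shortName, "");
    }

    // Blind write: the stored value is replaced without checking
    // whether it changed since the caller read it.
    void write(String shortName, String tags) {
        tagsByShortName.put(shortName, tags);
    }
}

class LostUpdateDemo {
    static String append(String existing, String tag) {
        return existing.isEmpty() ? tag : existing + ", " + tag;
    }

    /** Replays scenario 1 in a fixed order and returns the final tags. */
    static String run() {
        NaiveTagStore store = new NaiveTagStore();
        store.write("ID22", "abc:d0");

        // Both users load the same state before either of them saves.
        String seenByUserA = store.read("ID22");
        String seenByUserB = store.read("ID22");

        store.write("ID22", append(seenByUserA, "abc:d1")); // user A commits first
        store.write("ID22", append(seenByUserB, "abc:d2")); // user B overwrites A's change

        return store.read("ID22");
    }

    public static void main(String[] args) {
        System.out.println(run()); // the tag added by user A is no longer present
    }
}
```

No error is raised at any point, which is why this kind of data loss goes unnoticed.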

The version of the software on the main branch does not solve the race conditions in the tag addition function. An attempt to solve the problem is available on a separate branch called dev_raceConditions. Some test templates for checking the race conditions are provided in the code (TaggingMechanismTest, TaggingRaceConditionsTest), but for the reasons described below, they do not succeed.

The third problem has been solved adequately. The solution to the first and the second is only partial and unstable. Further work is necessary to arrive at a satisfactory answer to the problem.

The main obstacle turned out to be the combination of the choice of primary key and the requirements of the versioning system. Because a new version of the SpecItem has to be created every time a change is made, the reference point for detecting conflicts constantly changes, which makes the race conditions hard to solve. For example, if adding tags to a SpecItem (i.e., clicking the add tags button) immediately sends the information about the new version (a new commit time, and thus a new version of the SpecItem), there is no way, at least to our knowledge, to use a countermeasure such as optimistic locking. Thus, the first idea was to always refer to the commit time of the SpecItem being updated instead of immediately creating a new version of it.

The snippets below present the approach for solving the race conditions using so-called Optimistic Locking. Optimistic Locking is a mechanism that checks an updated entity's version property before the transaction is committed. If there is a discrepancy between the versions, it means that someone has updated the entity in the meantime, and there is a risk of silent data loss. Specifically, in Java with JPA, one can use the mechanism by setting the @Version annotation on a field that belongs to a class marked as @Entity. Each entity update results in a new version (new versions are set automatically - there is no need for a developer to be involved). Next, in a method where race conditions can occur, one creates a try-catch block and handles Spring's ObjectOptimisticLockingFailureException, which is thrown if the version has changed.
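
To make the version check concrete without a persistence provider, here is a minimal sketch in plain Java that mimics the behavior. It is not JPA: `VersionedTags`, `VersionedTagRow`, and `StaleVersionException` are hypothetical names, and `StaleVersionException` merely plays the role that ObjectOptimisticLockingFailureException plays in the project.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Immutable snapshot of a row: its tags plus the version at which it was read. */
final class VersionedTags {
    final long version;
    final String tags;

    VersionedTags(long version, String tags) {
        this.version = version;
        this.tags = tags;
    }
}

/** Hypothetical exception signalling that the row changed after it was read. */
class StaleVersionException extends RuntimeException {
    StaleVersionException(String message) {
        super(message);
    }
}

/** Hypothetical single-row store that rejects writes based on an outdated snapshot. */
class VersionedTagRow {
    private final AtomicReference<VersionedTags> current =
        new AtomicReference<>(new VersionedTags(0, ""));

    VersionedTags read() {
        return current.get();
    }

    /** Writes only if nobody has committed since {@code basedOn} was read. */
    void update(VersionedTags basedOn, String newTags) {
        VersionedTags next = new VersionedTags(basedOn.version + 1, newTags);
        // compareAndSet succeeds only if the stored snapshot is still the one we read.
        if (!current.compareAndSet(basedOn, next)) {
            throw new StaleVersionException("row changed since version " + basedOn.version);
        }
    }
}
```

If two callers both call `read()` and then `update(...)` with their own snapshot, the first update succeeds and bumps the version, while the second fails with the exception instead of silently overwriting the first.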

The first component is the completeTagAdditionProcess function, which is called from the Controller after the Save Tags button is clicked in the GUI. The call to the saveTags function (presented below) is wrapped in a try-catch block whose catch clause handles the ObjectOptimisticLockingFailureException.

    public TagInfo completeTagAdditionProcess(final SpecItem taggedSpecItem, final String newTags) throws InterruptedException {
        final LocalDateTime newCommitTime = LocalDateTime.now();
        TagInfo result;
        boolean wentThroughLockingFallback = false;
        try {
            log.info("Saving the tags: {} for SpecItem with ID:{} and CommitTime: {}",
                newTags, taggedSpecItem.getShortName(), taggedSpecItem.getCommitTime());
            result = this.tagService.saveTags(taggedSpecItem.getShortName(), taggedSpecItem.getCommitTime(), newTags, false);

        } catch (ObjectOptimisticLockingFailureException lockingFailureException) {
            log.info("There was a concurrent update. The new version will be saved.");
            // 1. Wait a bit
            Thread.sleep(3000);
            // 2. Get the tags for the item that caused the locking (ID, Old)
            final TagInfo currentTagOfTaggedSpecItem = this.tagService.getTagsBySpecItemIdAndCommitTime(
                taggedSpecItem.getShortName(), taggedSpecItem.getCommitTime());
            // 3. Save the tag info of the new version as the version has been increased,
            // and it is not possible to save under the same primary key
            String allTags = currentTagOfTaggedSpecItem.getTags() + ", " + newTags;
            result = this.tagService.saveTags(taggedSpecItem.getShortName(), newCommitTime, allTags, true);
            wentThroughLockingFallback = true;
        }
        if (wentThroughLockingFallback) {
            this.createAndSaveNewVersion(taggedSpecItem, LocalDateTime.now());
        } else {
            this.createAndSaveNewVersion(taggedSpecItem, newCommitTime);
        }
        return result;
    }
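
As a point of comparison with the fixed Thread.sleep and single fallback save above, a common alternative shape is a bounded retry loop that re-reads the latest state and re-applies the change on every conflict. The sketch below is plain Java against an in-memory row, not the project's service layer; `TagSnapshot`, `RetryingTagRow`, `addTagWithRetry`, `RetryDemo`, and the retry limit are all hypothetical, and whether this pattern fits the requirement of creating a new SpecItem version per change has not been verified.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Immutable snapshot of the row: tags plus the version at which it was read. */
final class TagSnapshot {
    final long version;
    final String tags;

    TagSnapshot(long version, String tags) {
        this.version = version;
        this.tags = tags;
    }
}

/** Hypothetical in-memory row with a version-checked write and a retrying helper. */
class RetryingTagRow {
    private final AtomicReference<TagSnapshot> current =
        new AtomicReference<>(new TagSnapshot(0, ""));

    TagSnapshot read() {
        return current.get();
    }

    /** Re-reads and re-applies the change until it commits or the attempts run out. */
    TagSnapshot addTagWithRetry(String newTag, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            TagSnapshot latest = read(); // always start from the freshest state
            String merged = latest.tags.isEmpty() ? newTag : latest.tags + ", " + newTag;
            TagSnapshot next = new TagSnapshot(latest.version + 1, merged);
            // The write commits only if nobody else committed since 'latest' was read.
            if (current.compareAndSet(latest, next)) {
                return next;
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }
}

class RetryDemo {
    /** Runs two concurrent writers and returns the final state of the row. */
    static TagSnapshot runTwoWriters() {
        RetryingTagRow row = new RetryingTagRow();
        Thread first = new Thread(() -> row.addTagWithRetry("abc:d1", 10));
        Thread second = new Thread(() -> row.addTagWithRetry("abc:d2", 10));
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return row.read();
    }
}
```

With this shape, the tags written by the competing request are picked up by the re-read rather than by a separate lookup keyed on the old commit time.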

The saveTags function:

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    @Override
    public TagInfo saveTags(final String specItemShortName, final LocalDateTime specItemCommitTime,
                            final String tags, boolean isLockingScenario) {
        final TagInfo existingTagInfo = this.getLatestById(specItemShortName);
        String allTags;
        if (existingTagInfo != null) {
            if (existingTagInfo.getTags().isEmpty()) {
                allTags = tags;
            } else {
                if (existingTagInfo.getTags().length() > tags.length()) {
                    allTags = tags;
                } else {
                    allTags = existingTagInfo.getTags() + ", " + tags;
                }
            }
            allTags = removeDuplicates(allTags);
            existingTagInfo.setTags(allTags);
            return handleSaveAccordingToLockingScenario(specItemCommitTime, isLockingScenario, existingTagInfo);
        } else {
            final TagInfo newTagInfo = new TagInfo();
            newTagInfo.setCommitTime(specItemCommitTime);
            newTagInfo.setShortName(specItemShortName);
            newTagInfo.setTags(tags);
            this.tagsRepo.saveAndFlush(newTagInfo);
            return handleSaveAccordingToLockingScenario(specItemCommitTime, isLockingScenario, newTagInfo);
        }
    }
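
The removeDuplicates helper called in saveTags is not shown on this page. Assuming tags are stored as a comma-separated string, as the concatenation with ", " suggests, it could look roughly like the following sketch. The class name `TagStrings` and the exact trimming behavior are assumptions, not the project's implementation.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class TagStrings {
    /**
     * Splits on commas, trims each tag, drops blank entries and repeats,
     * and joins the rest with ", " in first-seen order.
     */
    static String removeDuplicates(String tags) {
        return Arrays.stream(tags.split(","))
            .map(String::trim)
            .filter(tag -> !tag.isEmpty())
            .distinct()
            .collect(Collectors.joining(", "));
    }
}
```

For example, `TagStrings.removeDuplicates("abc:d1, abc:d2, abc:d1")` yields `abc:d1, abc:d2`.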

...and the handleSaveAccordingToLockingScenario:

    private TagInfo handleSaveAccordingToLockingScenario(final LocalDateTime specItemCommitTime, final boolean isLockingScenario,
                                                         final TagInfo newTagInfo) {
        if (isLockingScenario) {
            log.info("Saving the spec item as a res of res of locking. Commit time: {}", specItemCommitTime);
            return this.entityManager.merge(newTagInfo);
        } else {
            log.info("Saving the spec item normally. Commit time: {}", specItemCommitTime);
            return this.tagsRepo.saveAndFlush(newTagInfo);
        }
    }

Consider now the scenario in which two new tags are added in parallel, for example via curl (some properties of SpecItem are omitted for brevity): one request is curl --json '{"tagList":"abc:d2", "shortname":"ID22","commitTime":[2022,2,12,21,49,13]' http://localhost:8080/post/tags and the other sends '{"tagList":"abc:d1", "shortname":"ID22","commitTime":[2022,2,12,21,49,13]' to the same endpoint. In this case, the tag abc:d2 is added and the version of the SpecItem with short name ID22 and CommitTime=[2022,2,12,21,49,13] is created:

[Screenshot 2023-02-07 at 21 53 36]

However, the second update is lost even though the function enters the catch block:

            log.info("There was a concurrent update. The new version will be saved.");
            // 1. Wait a bit
            Thread.sleep(3000);
            // 2. Get the tags for the item that caused the locking (ID, Old)
            final TagInfo currentTagOfTaggedSpecItem = this.tagService.getTagsBySpecItemIdAndCommitTime(
                taggedSpecItem.getShortName(), taggedSpecItem.getCommitTime());

In the case of a concurrent update, currentTagOfTaggedSpecItem evaluates to null, even though the tags from the other request have been saved, which suggests an issue with flushing the entity. An attempt was made to tackle this problem by introducing the EntityManager in TagServiceImpl to get more control over the saving process:

        if (isLockingScenario) {
            log.info("Saving the spec item as a res of res of locking. Commit time: {}", specItemCommitTime);
            return this.entityManager.merge(newTagInfo);
        } else {
            log.info("Saving the spec item normally. Commit time: {}", specItemCommitTime);
            return this.tagsRepo.saveAndFlush(newTagInfo);
        }

Depending on the scenario (presence or absence of the locking exception), the entity was either merged or saved with flushing, unfortunately to no avail.
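
Independently of the flushing issue, a null result from the lookup in the catch block means that the expression currentTagOfTaggedSpecItem.getTags() would throw a NullPointerException before the fallback save is reached. A defensive way to build the combined tag string might look like the sketch below. `TagMerging` and `mergeTags` are hypothetical names, and the sketch only guards the string concatenation; it does not address why the lookup returns null.

```java
import java.util.Optional;

class TagMerging {
    /**
     * Joins existing and new tags with ", ".
     * A missing (null) or blank existing value contributes nothing.
     */
    static String mergeTags(String existingTagsOrNull, String newTags) {
        return Optional.ofNullable(existingTagsOrNull)
            .map(String::trim)
            .filter(existing -> !existing.isEmpty())
            .map(existing -> existing + ", " + newTags)
            .orElse(newTags);
    }
}
```

A caller would pass the tags of the looked-up TagInfo if it exists and null otherwise, so the new tags are still saved when the earlier version cannot be found.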