DMN 1.3 RTF Avatar
  1. OMG Issue

DMN13 — Should add a new unicode escape syntax

  • Key: DMN13-143
  • Status: closed  
  • Source: fujitsu america ( keith swenson)
  • Summary:

    On page 110, rule #66 gives these escape sequences:

    string escape sequence = "\'" | "\"" | "
    " | "\n" | "\r" | "\t" | "\u", hex digit, hex digit, hex digit, hex digit;

    Using the hex character value notation "\uD41A" you can only four hex digits which is not enough to specify the entire Unicode character range. The spec specifically mentions that much higher values are allowed, and there should be a standard way to specify them without having to use the cryptic surrogate character pairs.

    This suggestion is to add braces to allow variable length hex values in exactly the way that JavaScript allows:

    "\u

    {1F40E}

    "

    This is character 128,014 in the unicode set. To specify this character using surrogate pairs you would have to use: "\uD83D\uDC0E" which is complicated and confusing and is currently subject to discussion on whether this is valid or not within the current spec.

    "A" = "\u

    {41}

    "

    "è" = "\u

    {E8}

    "

    "査" = "\u

    {67FB}

    "

    "" = "\u

    {1F407}

    " (Jira seems to crash if I inlcude this character)

    It is nice because you actually use the Unicode code point value, and because of the braces you can specify a variable number of hex digits so the entire Unicode set can be easily addressed. The precidence for this syntax is set by JavaScript (

    https://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf

    See page 218 where it defines this (slightly translated):

    UnicodeEscapeSequence ::
    \u Hex4Digits
    \u

    { CodePoint }

    This will not conflict with any existing implementation since the opening brace was not allowed in earlier implementations. The new allowed syntax could be used in DMN 1.3 and higher.

  • Reported: DMN 1.2b1 — Fri, 8 Feb 2019 19:06 GMT
  • Disposition: Duplicate or Merged — DMN 1.3
  • Disposition Summary:

    Addressed by resolution of DMN-127

    DMN-127 adds \U, 6*hex digit to address this issue as well

  • Updated: Tue, 26 Jan 2021 20:17 GMT