UML 2.5 FTF

UML25 — String concrete syntax is missing

  • Key: UML25-267
  • Legacy Issue Number: 17821
  • Status: closed  
  • Source: Model Driven Solutions (Dr. Edward Willink)
  • Summary:

    Specifying a Concrete Syntax requires just that. It must provide a solution to expressing all characters in a predictable fashion. So given the constraint of "..." encapsulation, how are internal " represented? How are Unicode and newlines expressed?

    Suggest reusing the OCL 2.3 clarification of backslashes.

    The concrete syntax comprises a sequence of zero or more characters or escape sequences surrounded by single quote characters. The [B] form with adjacent strings allows a long string literal to be split into fragments or to be written across multiple lines.

    [A] StringLiteralExpCS ::= #x27 StringChar* #x27
    [B] StringLiteralExpCS[1] ::= StringLiteralExpCS[2] WhiteSpaceChar* #x27 StringChar* #x27

    where

    StringChar ::= Char | EscapeSequence

    WhiteSpaceChar ::= #x09 | #x0a | #x0c | #x0d | #x20

    Char ::= #x20-#x26 | #x28-#x5B | #x5D-#xD7FF | #xE000-#xFFFD | #x10000-#x10FFFF

    EscapeSequence ::= '\' 'b' -- #x08: backspace BS
                     | '\' 't' -- #x09: horizontal tab HT
                     | '\' 'n' -- #x0a: linefeed LF
                     | '\' 'f' -- #x0c: form feed FF
                     | '\' 'r' -- #x0d: carriage return CR
                     | '\' '"' -- #x22: double quote "
                     | '\' ''' -- #x27: single quote '
                     | '\' '\' -- #x5c: backslash \
                     | '\' 'x' Hex Hex -- #x00 to #xFF
                     | '\' 'u' Hex Hex Hex Hex -- #x0000 to #xFFFF

    Hex ::= [0-9] | [A-F] | [a-f]

    A minor change could share the syntax definition and define the body as above, prohibiting unescaped use of whichever character serves as the surrounding quote.

  • Reported: UML 2.4.1 — Wed, 26 Sep 2012 04:00 GMT
  • Disposition: Resolved — UML 2.5
  • Disposition Summary:

    The issue statement presumes that characters in a string are encoded as Unicode. However, as a modeling language,
    UML does not presume any specific character set (8.2.4 specifically states that “The character set used is
    unspecified”). Therefore it is not possible to provide specific, standard notation for specific characters, particularly
    unprintable control characters.
    Disposition: Closed - No Change

  • Updated: Fri, 6 Mar 2015 20:59 GMT
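
For readers who want to see how the grammar proposed in the Summary would behave, the following is a minimal Python sketch of a decoder for forms [A] and [B]. It illustrates only the OCL-style syntax suggested by the submitter; the issue was closed without change, so UML itself defines no such syntax. The function and constant names (`decode_string_literal`, `SIMPLE_ESCAPES`, and so on) are hypothetical, and the sketch assumes the input is already available as Unicode text and that the Char ranges are all hexadecimal code points.

```python
# Illustrative decoder for the string-literal grammar proposed in the Summary.
# All names here are hypothetical; nothing below is defined by UML or OCL.

SIMPLE_ESCAPES = {
    "b": "\b", "t": "\t", "n": "\n", "f": "\f", "r": "\r",
    '"': '"', "'": "'", "\\": "\\",
}
HEX_DIGITS = set("0123456789abcdefABCDEF")
WHITESPACE = set("\t\n\f\r ")  # WhiteSpaceChar: #x09 #x0a #x0c #x0d #x20


def _is_plain_char(ch):
    """Char production: allowed code points, excluding ' (#x27) and \\ (#x5C)."""
    cp = ord(ch)
    return (0x20 <= cp <= 0x26 or 0x28 <= cp <= 0x5B or 0x5D <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)


def _read_hex(text, pos, count):
    """Read exactly `count` Hex digits at `pos` and return the character."""
    digits = text[pos:pos + count]
    if len(digits) != count or any(d not in HEX_DIGITS for d in digits):
        raise ValueError(f"expected {count} hex digits at offset {pos}")
    return chr(int(digits, 16))


def decode_string_literal(text):
    """Decode one or more adjacent single-quoted fragments into a str."""
    out = []
    pos = 0
    n = len(text)
    seen_fragment = False
    while True:
        # Form [B]: whitespace may separate adjacent fragments.
        # Trailing whitespace is tolerated here as a convenience.
        while seen_fragment and pos < n and text[pos] in WHITESPACE:
            pos += 1
        if seen_fragment and pos == n:
            return "".join(out)
        if pos >= n or text[pos] != "'":
            raise ValueError(f"expected opening quote at offset {pos}")
        pos += 1
        # Form [A]: StringChar* up to the closing quote.
        while True:
            if pos >= n:
                raise ValueError("unterminated string literal")
            ch = text[pos]
            if ch == "'":
                pos += 1
                break
            if ch == "\\":
                if pos + 1 >= n:
                    raise ValueError("dangling backslash")
                esc = text[pos + 1]
                if esc in SIMPLE_ESCAPES:
                    out.append(SIMPLE_ESCAPES[esc])
                    pos += 2
                elif esc == "x":
                    out.append(_read_hex(text, pos + 2, 2))
                    pos += 4
                elif esc == "u":
                    out.append(_read_hex(text, pos + 2, 4))
                    pos += 6
                else:
                    raise ValueError(f"unknown escape at offset {pos}")
            elif _is_plain_char(ch):
                out.append(ch)
                pos += 1
            else:
                raise ValueError(f"unescaped character at offset {pos}")
        seen_fragment = True
```

Under these assumptions, `decode_string_literal("'foo' 'bar'")` returns `"foobar"`, while a raw tab or line break inside a fragment is rejected and must be written as `\t` or `\n`. The sketch takes Unicode input for granted, which is exactly the presumption the Disposition Summary declines to make for UML.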