ArticleML

A simple XML vocabulary for news articles.

ArticleML intentionally leaves out almost all metadata, apart from a few optional slots for embedded metadata. Instead, ArticleML is intended to be used together with content packaging formats such as NewsML.

Because ArticleML is a clean, concise, consistent, and legacy-free XML vocabulary, software and stylesheets that process it are easy to write. For example, XSLT transformations that convert ArticleML into HTML, plain text, PDF, and popular import formats for pagination systems (such as Adobe InDesign's Tagged Text and QuarkXPress's XPress Tags) are straightforward to write.
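
As a rough illustration of how little code such a conversion needs, here is a minimal sketch of a plain-text converter. It uses Python's standard `xml.etree.ElementTree` in place of XSLT. The element names come from the Quick View further down this page; the sample article, the function names, and the decision to handle only the headline, byline, and `p` elements are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names follow the Quick View.
SAMPLE = """\
<article>
  <headline>
    <mainHeadline>Council approves budget</mainHeadline>
    <subHeadline>Vote was <highlight class="italic">unanimous</highlight></subHeadline>
  </headline>
  <byline>By A. Reporter</byline>
  <p>The council met on Monday.</p>
  <p>A second paragraph.</p>
</article>
"""


def flat_text(elem):
    """Return all text inside elem, dropping inline markup and extra whitespace."""
    return " ".join("".join(elem.itertext()).split())


def article_to_text(xml_source):
    """Render headline lines, byline, and paragraphs as plain text.

    Simplification: other containers (table, media, list, ...) are ignored.
    """
    root = ET.fromstring(xml_source)
    lines = []
    for path in ("headline/mainHeadline", "headline/subHeadline", "byline"):
        for elem in root.findall(path):
            lines.append(flat_text(elem))
    lines.append("")  # blank line between the header block and the body
    for paragraph in root.findall("p"):
        lines.append(flat_text(paragraph))
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(article_to_text(SAMPLE), end="")
```

An XSLT stylesheet for the same job would follow the same shape: one template per element, with inline markup reduced to its text.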

Examples:
  • ArticleML validatable via DTD
  • ArticleML validatable via Schema

Specifications:
  • ArticleML DTD
  • ArticleML Schema: Main File | Supplementary File

Documentation:
  • Documentation generated by XML Spy
  • Quick View

    <article>
    
    	<abstract>(container)*</abstract>?
    
    	<headline>
    
    		<superHeadline>(enrichedText)*</superHeadline>?
    
    		<mainHeadline>(enrichedText)*</mainHeadline>
    
    		<subHeadline>(enrichedText)*</subHeadline>*
    
    	</headline>?
    
    	<byline>(enrichedText)*</byline>?
    
    	(container)*
    
    	(closer)?
    
    </article>
    
    
    Where container is:
    
          p 		holds (<dateLine> | enrichedText)*
        | subHeadline 	holds enrichedText*
        | table 		holds <tr>, <td>, etc.
        | media	 	holds reference to media object, plus <mediaCredit> and <mediaCaption>
        | list 		holds <listItem>s, each of which holds enrichedText*
        | sidebar 		holds <headline>?, container*
        | pre 		holds enrichedText*
        | editorialNote 	holds enrichedText*
    
    And enrichedText is:
    
          #PCDATA
    
        | phrase 		type = company | person | title | etc.
                            literal = MSFT | 1-55615-678-2 | etc.
                        OR: qcode = company:MSFT
    
        | highlight 	class = bold | italic | etc.
    
        | link		href = some_url
        
        | break
    
    And closer is:
          creditLine	holds enrichedText*
        | bio		holds enrichedText*
    
    And <dateLine>		holds enrichedText*
    	
    All elements have id, class, and style attributes.
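
The `?` and `*` markers in the Quick View can be read as a small grammar over the children of `<article>`: an optional abstract, an optional headline, an optional byline, any number of containers, and an optional closer. The sketch below checks that top-level order in Python. It is an illustration only, not a substitute for validating against the DTD or Schema; the function names and the one-letter encoding are assumptions of the example, and nested content (for instance the children of `<headline>`) is not checked.

```python
import re
import xml.etree.ElementTree as ET

# Element names taken from the Quick View.
CONTAINERS = {"p", "subHeadline", "table", "media", "list",
              "sidebar", "pre", "editorialNote"}
CLOSERS = {"creditLine", "bio"}

# One letter per top-level child turns the content model into a regex:
# abstract? headline? byline? container* closer?
ARTICLE_MODEL = re.compile(r"A?H?B?C*Z?")


def classify(tag):
    """Map a child element name to its one-letter class."""
    if tag == "abstract":
        return "A"
    if tag == "headline":
        return "H"
    if tag == "byline":
        return "B"
    if tag in CONTAINERS:
        return "C"
    if tag in CLOSERS:
        return "Z"
    return "?"  # unknown element; the model contains no literal "?", so this never matches


def follows_quick_view(xml_source):
    """Return True if the children of <article> appear in the Quick View order."""
    root = ET.fromstring(xml_source)
    if root.tag != "article":
        return False
    signature = "".join(classify(child.tag) for child in root)
    return ARTICLE_MODEL.fullmatch(signature) is not None


# Hypothetical sample documents.
GOOD = ("<article><headline><mainHeadline>Title</mainHeadline></headline>"
        "<byline>By A. Reporter</byline><p>Text.</p><bio>About the author.</bio></article>")
BAD = ("<article><byline>By A. Reporter</byline>"
       "<headline><mainHeadline>Title</mainHeadline></headline></article>")

if __name__ == "__main__":
    print(follows_quick_view(GOOD))
    print(follows_quick_view(BAD))
```

`GOOD` passes because its children arrive as headline, byline, container, closer; `BAD` fails because the byline precedes the headline.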