Data Processor: Data Input and Output Manipulation

CKEditor is a browser-based editor: it uses the browser's own infrastructure to create its editing environment. As a result, the editing area is simply an HTML page that the user manipulates by typing or by using the editor's features. Technically speaking, CKEditor performs DOM manipulation on the page contents.

Although CKEditor uses HTML and the browser DOM as its basis for editing, it can take any kind of data as input and produce any kind of data as output. This data transformation is handled by the so-called "Data Processor".

The Data Processor

From the JavaScript point of view, the Data Processor is an object that transforms the input data into HTML and later transforms that HTML back into the output data format. These transformations are provided by its toHtml and toDataFormat functions, respectively.
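As a rough illustration of that interface, the sketch below defines a hypothetical processor whose data format is plain text with paragraphs separated by blank lines. The names plainTextProcessor, escapeHtml and unescapeHtml are assumptions made for this example and are not part of CKEditor; only the toHtml and toDataFormat method names come from the description above. The editor may pass additional arguments to these methods, which this sketch ignores.

```javascript
// Hypothetical example: a data processor whose "data format" is plain text,
// with paragraphs separated by blank lines. Not part of CKEditor.
function escapeHtml( text )
{
    return text.replace( /&/g, '&amp;' ).replace( /</g, '&lt;' ).replace( />/g, '&gt;' );
}

function unescapeHtml( html )
{
    // '&amp;' is decoded last so that '&amp;lt;' becomes '&lt;', not '<'.
    return html.replace( /&lt;/g, '<' ).replace( /&gt;/g, '>' ).replace( /&amp;/g, '&' );
}

var plainTextProcessor =
{
    // Input: data format -> HTML to be loaded into the editing area.
    toHtml : function( data )
    {
        var paragraphs = data.split( /\n\s*\n/ );
        var html = [];
        for ( var i = 0 ; i < paragraphs.length ; i++ )
            html.push( '<p>' + escapeHtml( paragraphs[ i ] ) + '</p>' );
        return html.join( '' );
    },

    // Output: HTML from the editing area -> data format.
    toDataFormat : function( html )
    {
        var text = [];
        html.replace( /<p>([\s\S]*?)<\/p>/g, function( match, content )
        {
            text.push( unescapeHtml( content ) );
        });
        return text.join( '\n\n' );
    }
};
```

The two functions are written as inverses of each other, so data loaded through toHtml and read back through toDataFormat comes out unchanged as long as the editing area only contains simple paragraphs.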

[Figure: CKEditor Data Processor]

The Default Data Processor: XHTML

CKEditor is distributed with an XHTML Data Processor. This may sound strange, since the editor already works with HTML, but the processor transforms the (X)HTML data entered into the editor into "good" HTML. In the process it also makes some transformations required by certain editor features.

The XHTML Data Processor is a fairly complex and flexible piece of software. It is composed of several parts:

[Figure: CKEditor XHTML Data Processor]

  • HTML Parser: reads the HTML entered into the editor and transforms it into "good" HTML. It also converts the HTML string into a JavaScript object tree that can be manipulated by "filters". See CKEDITOR.htmlParser.
  • HTML Writer: outputs strings in HTML format based on the object tree representation of the data.

Note that the HTML Parser is used on output as well, when retrieving the HTML from the browser. This is needed because browsers, especially IE, may produce poor-quality markup.

The default Data Processor can be retrieved and manipulated through the editor.dataProcessor property.

HTML Parser Filters

Filters are one of the most powerful features of the HTML Parser. They are applied to the object representation of the parsed HTML data when it is transformed back into strings.
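To make the idea concrete, here is a deliberately simplified model of filtering at write time. The tree shape, the rules object and the writeNode function are all assumptions invented for this sketch; they are not CKEditor's classes, and the sketch skips details such as escaping text and attribute values.

```javascript
// Toy model only: a simplified element tree and writer, NOT CKEditor's classes.
var tree =
{
    name : 'p',
    attributes : { 'class' : 'intro' },
    children : [ 'Hello ', { name : 'b', attributes : {}, children : [ 'world' ] } ]
};

// Hypothetical rule set: one function per element name.
var rules =
{
    b : function( element ) { element.name = 'strong'; }
};

function writeNode( node, rules )
{
    // Text nodes are written as they are.
    if ( typeof node == 'string' )
        return node;

    // The rule runs on the object just before it is serialized.
    if ( rules[ node.name ] )
        rules[ node.name ]( node );

    var html = '<' + node.name;
    for ( var attr in node.attributes )
        html += ' ' + attr + '="' + node.attributes[ attr ] + '"';
    html += '>';

    for ( var i = 0 ; i < node.children.length ; i++ )
        html += writeNode( node.children[ i ], rules );

    return html + '</' + node.name + '>';
}

var output = writeNode( tree, rules );
// output is '<p class="intro">Hello <strong>world</strong></p>'
```

Because the rule changes the object rather than the string, it does not need to know anything about how the markup is eventually written.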

Typical applications of filters include:

  • Removing attributes from elements.
  • Rewriting attribute values.
  • Renaming elements or attributes.
  • Adding missing attributes to elements.

The default XHTML Data Processor already applies several filters to the data on both input and output. You can extend it by adding custom filter rules, by calling addRules(), to the following objects:

  • editor.dataProcessor.dataFilter: filter applied to the input data when it is transformed into HTML to be loaded into the editor ("on input").
  • editor.dataProcessor.htmlFilter: filter applied to the HTML available in the editor when it is transformed into the XHTML output by the editor ("on output").

For example, the following code ensures that every <img> element has its "alt" attribute filled in:

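// Runs on output: the rule is applied to each <img> element being written.
editor.dataProcessor.htmlFilter.addRules(
    {
        elements :
        {
            img : function( element )
            {
                // Supply a default only when "alt" is missing or empty.
                if ( !element.attributes.alt )
                    element.attributes.alt = 'An image';
            }
        }
    });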
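The same mechanism can be used on input through dataFilter. The sketch below groups a few of the filter applications listed earlier into one rule set. The names inputRules and addInputRules, and the example.com URL, are hypothetical. The attributes rule group, its ( value, element ) arguments and the convention that returning false removes the attribute are assumptions about the filter API that should be checked against the CKEDITOR.htmlParser.filter documentation for the version in use.

```javascript
// Hypothetical rule set for input filtering.
var inputRules =
{
    elements :
    {
        // Rename an element: <b> becomes <strong>.
        b : function( element )
        {
            element.name = 'strong';
        }
    },

    attributes :
    {
        // Remove an attribute. Assumption: returning false drops it.
        onclick : function( value, element )
        {
            return false;
        },

        // Rewrite an attribute value: make links to this (made-up) site relative.
        href : function( value, element )
        {
            return value.replace( /^http:\/\/example\.com\//, '/' );
        }
    }
};

// Hypothetical helper: registers the rules on a given editor instance.
function addInputRules( editor )
{
    editor.dataProcessor.dataFilter.addRules( inputRules );
}
```

Keeping the rules in a plain object, separate from the call that registers them, makes it possible to exercise each rule function on its own before wiring it into an editor.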

Custom Data Processors

Plugin implementers can create data processors that transform custom, non-HTML data formats so that they can be loaded into and output by the editor. For example, there could be data processors for BBCode, Wiki markup, or any other kind of structured markup. Even server-side transformation could be considered, by making Ajax-style calls from the data processor code.

To make the editor use a custom data processor, it is enough to set editor.dataProcessor to an object that implements the data processor interface; this replaces the default XHTML Data Processor.

Custom data processors may reuse parts of the default XHTML Data Processor code, including the HTML Parser or even the HTML Writer. The htmldataprocessor plugin code is a recommended source of reference and ideas.
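As a minimal sketch of such a replacement, the following hypothetical processor handles only the BBCode [b] and [i] tags. The names bbcodeProcessor and useBBCode are assumptions made for this example. For brevity it relies on regular expressions instead of reusing the HTML Parser, so it does no escaping and does not handle nested tags of the same kind; a real BBCode processor would need both.

```javascript
// Hypothetical BBCode processor covering only [b] and [i].
var bbcodeProcessor =
{
    // Input: BBCode -> HTML for the editing area.
    toHtml : function( data )
    {
        return data
            .replace( /\[b\]([\s\S]*?)\[\/b\]/g, '<strong>$1</strong>' )
            .replace( /\[i\]([\s\S]*?)\[\/i\]/g, '<em>$1</em>' );
    },

    // Output: HTML from the editing area -> BBCode.
    toDataFormat : function( html )
    {
        return html
            .replace( /<(?:strong|b)>([\s\S]*?)<\/(?:strong|b)>/gi, '[b]$1[/b]' )
            .replace( /<(?:em|i)>([\s\S]*?)<\/(?:em|i)>/gi, '[i]$1[/i]' );
    }
};

// Hypothetical helper: replaces the processor on an editor instance.
function useBBCode( editor )
{
    editor.dataProcessor = bbcodeProcessor;
}
```

useBBCode would be called once the editor instance exists, for example from a plugin; the exact point in the editor's life cycle at which the swap is safe is not covered here.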

This page was last edited on 20 October 2010, at 16:32.