Clean up HTML on paste in CKEditor

We use CKEditor at FiveFilters.org for our PastePad service. The idea is to let users paste content that’s not currently publicly available on the web so it can be processed with one of our web tools. This can be content that’s in a Word document, an email, or behind a paywall.

CKEditor can automatically clean up HTML it identifies as coming from MS Word, but there’s no way to force cleanup on all pasted content. By default, HTML cleanup occurs in the following two cases:

  1. User clicks the ‘paste from word’ toolbar icon
  2. User pastes content copied from MS Word itself

In the second case, CKEditor looks for signs of MS Word formatting. It does this by testing whatever you paste against the following regular expression:

/(class=\"?Mso|style=\"[^\"]*\bmso\-|w:WordDocument)/

If there’s a match, it will be cleaned up. Otherwise it will paste as normal.
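
To get a feel for what does and doesn’t trigger cleanup, here’s a small standalone sketch that runs the same regular expression against a few sample strings. The helper name `looksLikeWord` and the sample inputs are made up for illustration; they aren’t part of CKEditor.

```javascript
// The pattern quoted above, copied as-is.
var wordMarkers = /(class=\"?Mso|style=\"[^\"]*\bmso\-|w:WordDocument)/;

// Hypothetical helper (not a CKEditor API):
// returns true if the pasted HTML would be treated as coming from MS Word.
function looksLikeWord(html) {
    return wordMarkers.test(html);
}

looksLikeWord('<p class="MsoNormal">Hello</p>');            // true
looksLikeWord('<p style="mso-margin-top-alt:auto">Hi</p>'); // true
looksLikeWord('<p>Plain paragraph</p>');                    // false
```

Note that the pattern is matched against the raw string, so the marker doesn’t need to be a real attribute on a real element. Any occurrence of the text anywhere in the pasted HTML counts.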

I want to avoid editing core files, so my solution is simply to ensure that this regular expression always matches pasted content. Here’s what I’ve come up with:

CKEDITOR.on('instanceReady', function(ev) {
    ev.editor.on('paste', function(evt) {    
        evt.data['html'] = '<!-- class="Mso" -->'+evt.data['html'];
    }, null, null, 9);
});

I haven’t tested extensively, but this appears to work as expected (CKEditor 3.6.2). You can try it out.

The code registers a new listener for the paste event, just like the Paste from Word plugin does. When it receives the pasted HTML, it prepends an HTML comment containing one of the strings the Paste from Word plugin looks for. The listener has a priority of 9 so that it runs before the plugin’s own listener (default priority of 10), which triggers the actual cleaning.
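
The prepend step can also be pulled out into a small function, which makes it easy to check without loading the editor. This is only a sketch: `addWordMarker` and `WORD_MARKER` are hypothetical names, and the guard for a missing `html` value is my assumption about paste events that carry no HTML (a plain-text paste, for example), not something I’ve verified across CKEditor versions.

```javascript
// Marker comment containing one of the strings the Paste from Word check looks for.
var WORD_MARKER = '<!-- class="Mso" -->';

// Hypothetical helper: prepend the marker to data.html, but only if
// the paste data actually contains an HTML string.
function addWordMarker(data) {
    if (typeof data.html === 'string') {
        data.html = WORD_MARKER + data.html;
    }
    return data;
}

// Wire it up only when CKEditor is loaded on the page.
if (typeof CKEDITOR !== 'undefined') {
    CKEDITOR.on('instanceReady', function(ev) {
        ev.editor.on('paste', function(evt) {
            addWordMarker(evt.data);
        }, null, null, 9); // priority 9: run before the default priority of 10
    });
}
```

With the guard in place, paste data that has no `html` property is left alone instead of having the marker joined onto an undefined value.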

Note: I posted this solution on StackOverflow as an alternative to another solution, titled “CKEditor – use pastefromword filtering on all pasted content.” StackOverflow recently deleted some of my answers (and hid them from me) so I’m moving the rest of my meagre contributions over to my own blog.


10 Comments

  1. Expanism
    Posted 23 November 2012 at 2:00 pm | Permalink

    Thanks for this post, nice and short code 🙂 should I paste this in config.js and if so where precisely in the .js file?

  2. Posted 1 December 2012 at 4:38 pm | Permalink

    Hi Expanism, if you look at the link to our PastePad page, you’ll see we do this on the page itself, in a script element in the HTML header:

    $(document).ready(function() {
      CKEDITOR.on('instanceReady', function( ev ) {
        ev.editor.on( 'paste', function( evt ) {    
          evt.data['html'] = '<!-- class="Mso" -->'+evt.data['html'];
        }, null, null, 9);
      });
    });

    You could also probably put this in config.js, but I didn’t try.

  3. Posted 22 January 2013 at 10:34 pm | Permalink

    Is it possible to force paste from word without the prompt to tell the users to okay or cancel? I just want it to clean up for every case.

  4. Posted 23 January 2013 at 2:51 pm | Permalink

    Yes, that’s what this bit of code does – applies the cleanup to everything without prompt.

  5. Posted 25 February 2013 at 6:51 pm | Permalink

    I mean, say I want to do something like this:

    evt.editor.execCommand('RemoveFormat', evt.data.html);

  6. Jordan
    Posted 14 March 2013 at 11:37 am | Permalink

    This is one of those rare so-simple-it-works solutions, and also one that required coming at the problem from a very clever route.

    Thank you! I have honestly been searching for solutions and testing off and on for over a year now.

  7. Posted 30 September 2013 at 8:42 pm | Permalink

    This will remove garbage and some other things. You can add this to your ckeditor config file.

    CKEDITOR.on('instanceReady', function(ev) {
    ev.editor.on('paste', function(evt) {
    evt.data.dataValue = evt.data.dataValue.replace(//g, '' );
    evt.data.dataValue = evt.data.dataValue.replace(/ /g,'');
    evt.data.dataValue = evt.data.dataValue.replace(//g,'');
    console.log(evt.data.dataValue);
    }, null, null, 9);
    });

  8. Posted 11 July 2014 at 3:48 pm | Permalink

    Very useful code snippet, but unfortunately I have found a feature/buglet, I believe. Basically, when using the “paste as plain text” icon in CKEditor, one gets a “cleansing” popup dialog to ctrl-V into. When one pastes copied text into this now, with the above code, “undefined” gets inserted into the original text as opposed to the pasted text.

    Thoughts on why this might be happening?

    Thanks,

    Ed

  9. Robert
    Posted 1 March 2015 at 8:27 pm | Permalink

    This doesn’t seem to work in Ckeditor 4+

  10. Posted 13 October 2016 at 10:27 pm | Permalink

    For ckeditor 4+:


    CKEDITOR.on('instanceReady', function(ev) {
    ev.editor.on('paste', function(evt) {
    evt.data.dataValue = '<!--class="Mso"-->'+evt.data.dataValue;
    }, null, null, 2 );
    });
