JavaScript Query Engines

Thursday, 9 September 2010

By Garrett Smith, with input from John David Dalton, Scott Sauyet, Andrew Paulos, and a ton of feedback from Diego Perini.

Most popular javascript libraries these days have a CSS Selector query engine. The concept originated from CSSQuery and was popularized by jQuery. The idea is to match elements in the DOM based on a CSS selector string.

The W3C Selectors API Level 1, a Candidate Recommendation since 2009, was started in 2006, based on CSS selectors.

CSS2 selectors have been around for over 12 years. The syntax and concepts are easy to grasp and are well known — or are they?

What's the difference between the W3C Selectors API and those found in javascript libraries? They're both based on CSS Selectors, right? Aren't they all about the same?

It turns out they're not. There are many significant differences between CSS Selectors[CSS2] and the CSS Selector query engines defined in javascript libraries.

The differences are explained and demonstrated by the library examples.

When considering a Javascript library, it is important to examine the source code by code review in order to make an informed decision about its quality.

Library Examples

The examples demonstrate problems primarily in jQuery, but also in YUI 2, YUI 3, Ext-JS, and Sencha. Listing every bug in each major library would have been too much to cover in this already lengthy article.

CSS3 Compliance

To be CSS2 compliant, a CSS Selector Engine must follow the lexical grammar defined in the CSS2 specification to parse selector strings and perform correct matching on elements.

None of the libraries reviewed are compliant with any edition of CSS. The conformance violations are pretty obvious: Broken parsing, incorrect matching, errors thrown on CSS1 selectors and proprietary syntax extensions. These and other problems are explained below.

Although any library author is free to make any design decision he chooses, if the design decisions violate the CSS specifications and drafts (and these do), then the code cannot honestly be said to be CSS3 compliant.

For example, jQuery.com claims CSS3 Compliance and Ext-JS claims "DomQuery supports most of the CSS3 selectors spec, along with some custom selectors and basic XPath". Whether Ext supports more than half of CSS3 selectors depends on the browser; the claim of "basic XPath" support is false (possibly outdated documentation from previous editions which borrowed from jQuery).

What do the Libraries Do?

The current javascript library APIs do not adhere to CSS2 selectors[CSS2]. Most implement nonstandard extensions and different behavior for standard selectors. All of them fail to implement many pseudo-classes of CSS1-CSS3. Some match properties instead of attributes for attribute selectors. They all tend to copy each other, and the respective documentation of each doesn't always reflect reality. They tend to change substantially between releases: removing support for XPath, redesigning around document.querySelectorAll, removing some selectors and adding others. They tend to work differently in IE, depending on the mode. The article will elaborate on cases of these things happening in javascript libraries.

Cross Browser Consistency?

A few superficial tests in this article demonstrate significant problems in the cross browser behavior of these libraries. More inconsistencies are revealed in Diego Perini's Index of CSS selector tests.

The supported browser list in jQuery includes IE 6.0+, FF 2+, Safari 3.0+, Opera 9.0+, and Chrome. However, all libraries tested have results within that set of browsers that are inconsistent with the spec, inconsistent between browsers, and, in the case of selector extensions, inconsistent with other libraries.

Problems Overview

Problems in jQuery, YUI2, YUI 3, Ext, and Sencha include:

  1. Broken by Design
    • Fundamentally broken abstractions and browser inconsistency.
    • Native First Dual Approach, or NFD - Inconsistent. Variance based on NodeSelector presence/absence or errors thrown and handled with the library's fallback.
    • Syntax Extensions - nonstandard and inconsistent.
    • Incorrect Documentation.
  2. Broken Parsing
    • Fails to ignore various whitespace in attribute values, as a[name= bar\n\t] (YUI2, Ext);
    • Not throwing errors on invalid, unhandled input, instead returning a result that is either empty or contains elements (All: YUI2, YUI3, Ext, Sencha, jQuery).
    • Parse multiple adjacent whitespace as multiple descendant selector. (Ext)
    • Fail to parse certain whitespace in descendant selector (YUI 2).
    • Splitting input on "," — this breaks attribute selectors where the attribute value contains a comma (e.g. "[title='Hey, Joe']"); (Ext, Sencha).
  3. Broken Matching
    • Universal selector mismatches
    • Attributes vs properties mistakes
    • Pseudo-class selectors returning every element or throwing errors
    • Case sensitivity applied to case-insensitive attributes

Problems Details

  1. Broken by Design

    The biggest problems are the design issues, which include the NFD approach and reliance on fundamentally broken abstractions. Some problems, such as pseudo-class related bugs, are seen in parsing and matching. These bugs cannot be as neatly fixed.

    • Fundamentally Broken Abstractions

      A fundamentally broken abstraction is an abstraction that cannot function consistently across browsers.

      The reviewed libraries' query selector engines are an example of a fundamentally broken abstraction. (Though the problems in the libraries reviewed go beyond the bugs that are seen in the query engines).

      You might be thinking something like: "Hey, nothing's perfect, right?", or "The libraries have a large user base; they can't be that broken, can they?" or "Can't the bugs be fixed?" Those things have all been said, and although fixing the bugs might seem like the right thing to do, in reality it can get complicated.

      For an example of fundamentally broken abstractions, see Sencha Touch--Support 2 browsers in just 228K!, SproutCore--over 20000 lines of new code!, and more of the library excerpts in this article below.

      Dependencies, Consistency, and Change

      Any change to a low-level abstraction propagates to its dependencies.

      Fixing bugs in a low level module creates instability which can break things (like jQuery plugins or widgets). Before attempting to fix any bug, the author must first get an understanding of the problem(s) caused by his code. In the case of a fundamentally broken abstraction, one choice may be to not attempt to fix the bug but to leave it and deprecate the method. If the bug is a core part of the library, it may be possible to refactor the library to not use that method. If that cannot be done (as is the case for the reviewed libraries) then not using the library is probably the best option.

      Most libraries that use a selector engine do so at a lower level. Each bug in a library's selector engine propagates to a higher level. If the selector engine's behavior is changed, as by fixing a bug, that change is propagated to all of the higher level dependencies. Such behavioral changes cause instability. The alternative to making such changes (and causing instability) is to not fix the bugs.

      The infamous dojo.isArray is one example of a bug in a low-level abstraction that was not fixed, despite having been pointed out in many discussions over many years on comp.lang.javascript, the es-discuss mailing list, and most recently on Ajaxian.com. The problems with the method are that it doesn't work cross-frame, it can return non-boolean values (0, null, undefined), and it has a useless statement (typeof it == "array"). The method will, however, have consistent results across browsers.
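The three problems listed above can be reproduced in a small sketch. Here `looseIsArray` is a hypothetical function written in the shape the text describes, not a copy of any library's source, and `strictIsArray` is one possible realm-independent alternative. Node's `vm` module is used only as a stand-in for a second browser frame.

```javascript
const vm = require("vm");

// Hypothetical check in the shape described in the text (an assumption,
// not library source).
function looseIsArray(it) {
  // typeof never yields "array", so the second operand is dead code.
  return it && (it instanceof Array || typeof it == "array");
}

// One realm-independent alternative: inspect the internal class name.
function strictIsArray(it) {
  return Object.prototype.toString.call(it) === "[object Array]";
}

// An array created in another global context, comparable to an array
// that comes from another frame in a browser.
const foreignArray = vm.runInNewContext("[1, 2, 3]");

console.log(looseIsArray(0));             // non-boolean: the falsy input is returned as-is
console.log(looseIsArray(foreignArray));  // false: instanceof fails across contexts
console.log(strictIsArray(foreignArray)); // true
```

The loose version behaves the same way in every engine, which is the point made above: it is wrong, but consistently wrong.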

      However, sometimes an abstraction fails to work consistently across browsers. This is often due to a limitation in a browser.

      The author who has come to recognize such a problem in the code is faced with the decision to either attempt to get the abstraction working correctly across browsers, leave it alone, or fix it to work correctly in some cases, while attempting to minimize change.

      Due to the generalized nature of the function, every case cannot be addressed. The result of attempting to make it work is bloat and complexity. The code becomes difficult for the author to understand and clients of the API are confused by inconsistent results, both with various inputs to the function and with versions of the API. The paradox is that if the author does not fix the bugs, then some instability can be avoided, but at a cost of inconsistency between browsers.

      One example of this is the attributes problem in Internet Explorer 7 and below (IE8 fixed most of the issues). Generalized functions that attempt to make IE correctly read attributes are more trouble and effort than they are worth. Instead, the problems of reading attributes can be recognized as a limitation, and the system can be designed so that it never needs to read them. The associated problems are then avoided altogether.

      The best solution for such abstractions is to know what you are doing. Do not create them in the first place.

      Any library that relies on a broken query selector at the core is just as broken as its query engine. Fixing the query engine bugs causes instability and the browser inconsistencies are unacceptable.

    • Native-First Dual Approach

      The most significant query selector problem is the design approach that I am calling a native-first dual (NFD) approach. NFD creates great inconsistencies between different browsers running the same code. The approach is to first try to use document.querySelectorAll where it is supported; where it is unsupported, or where calling it throws an error, a fallback selector matching engine is used.

      Because querySelectorAll throws an error when any proprietary selector is used, the code path taken varies, depending not just on the browser, but on the selector supplied. The library addresses these errors by wrapping every call to document.querySelectorAll in a try / catch. In jQuery, the fallback engine invoked in the catch block is named oldSizzle. In some cases, oldSizzle will throw an error where document.querySelectorAll would have returned a result, such as with :focus.

      jQuery's oldSizzle does not support the same input and standard selectors as querySelectorAll. The differences noticeable in the results of simple queries can vary widely across browsers, as seen in the examples further on. Any library that uses NFD (Ext and YUI, among those) will exhibit the same problems.

      Libraries that use NFD include jQuery, YUI 3, and Ext-js, among others. Sencha, which is related to Ext-js, uses a different approach. YUI 2 does not use document.querySelectorAll.

      Native-first Dual Approach Diagram
      <Native QSA Support?>
       Y              N
       |              |
       |              |
      [Try Use QSA]   +--[Use oldSizzle]
        |                /   |  
       <error thrown?>  /  <oldSizzle Supports Input?>
        Y             N/        Y          N
        |             /        |     [Throw error]
        |            /|        |              |           
      [Use oldSizzle] |   [perform match      |
                      |    and return result] |
                      |          |            |
                 [return result] |            |
                      |          |            |
                     END        END          END
        

      Diagram of native-first dual approach. Notice the three different possible endings.
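The control flow in the diagram can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any library's actual code: `nativeFirstQuery`, `mockFallback`, and the two mock root objects are hypothetical names, and the mocks stand in for a browser with and without querySelectorAll so that the sketch runs without a DOM.

```javascript
// Hypothetical sketch of the native-first dual (NFD) control flow.
function nativeFirstQuery(root, selector, fallbackEngine) {
  if (typeof root.querySelectorAll === "function") {
    try {
      // Ending 1: the native engine accepts the selector.
      return { path: "native", result: root.querySelectorAll(selector) };
    } catch (e) {
      // The native engine rejected the selector; use the fallback instead.
    }
  }
  // Ending 2: the fallback returns a result.
  // Ending 3: the fallback throws and the error reaches the caller.
  return { path: "fallback", result: fallbackEngine(root, selector) };
}

// Mock "modern browser": rejects the proprietary != operator.
const modernRoot = {
  querySelectorAll(selector) {
    if (selector.includes("!=")) throw new SyntaxError("invalid selector");
    return ["<a>"];
  }
};
// Mock "older browser": no querySelectorAll at all.
const legacyRoot = {};

// Mock fallback: supports [att!=val] but not :link.
function mockFallback(root, selector) {
  if (selector.includes(":link")) {
    throw new Error("unrecognized expression: link");
  }
  return ["<a>", "<div>"];
}

// Report which ending a given root and selector reach.
function describe(root, selector) {
  try {
    return nativeFirstQuery(root, selector, mockFallback).path;
  } catch (e) {
    return "error";
  }
}

const outcomes = {
  modernStandard: describe(modernRoot, ":link"),
  modernExtension: describe(modernRoot, ":link[rel!=nofollow]"),
  legacyStandard: describe(legacyRoot, ":link"),
  legacyExtension: describe(legacyRoot, "a[rel!=nofollow]")
};
console.log(outcomes);
```

With these mocks, the same selector ends differently depending on the root, and the same root ends differently depending on the selector.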

      In addition to the code path variations, native support is buggy. For example, in Internet Explorer 8, <option selected>text</option> isn't matched by [selected] but is matched by [selected=selected].

      The NFD approach is the most significant and fundamental mistake that a selectors library can make. It is broken by design.

    • Syntax Extensions

      Some examples of syntax extensions include variations on what jQuery calls bare words attribute selectors, [att!=val], CSS Style value selectors (in Ext), and even user-defined selectors.

      A W3C-compliant Selectors engine is required to throw errors on any invalid syntax in the selector, such as those extensions defined by jQuery.

      Instead of throwing an error, jQuery interprets [att!=val] as a property selector (described below). How a library interprets the syntax extension is nonstandard, proprietary, and may vary between libraries.

      Ext provides additional syntax extensions to match style values. For example, to match all the elements whose visibility is "inherit", one would use:

        [
          Ext.query("{visibility=inherit}").length,
          Ext.query("{visibility=visible}").length
        ]
      

      That code, running on the Ext test page, results in:

      Internet Explorer (all versions)
      18, 0
      Safari 4, Opera 10.6, Firefox 3.6*, Firefox 2
      0, 18

      The "visibility=inherit" result in Internet Explorer is an array of 18 elements, and 0 in other browsers. This is due to the fact that Ext.query relies on Ext.Dom.getStyle, which checks currentStyle in IE and calls getComputedStyle in other browsers.

      The result is only achieved when there is no trailing whitespace, as "{visibility=visible}", and not "{visibility=visible }".

      *Firefox results with plugins disabled. Some plugins such as Firebug add nodes to the document which will affect the result you see in your browser.

    • Incorrect Documentation

      The documentation for most of the libraries tends to be out of sync with what the code actually does. The most egregious offenders are Sencha and Ext-js.

      This is yet another compelling reason for anyone evaluating a library to carefully review the source code. The code explains exactly what it does. Does the code clearly reflect what is stated in the documentation?

      If examining the source code reveals that it is written obscurely, for example with long methods of high complexity, then it may be best to avoid the library on that basis. At some point, a part of the application will inevitably need to be debugged, and long, complicated methods such as those found in jQuery can be painfully time consuming to step through.

  2. Broken Parsing

    CSS2.1 defines the grammar by which tokens are matched. None of the libraries tested are compliant with that grammar. Most fail in very obvious ways. Some of the problems include:

    • Fails to ignore various whitespace in attribute values, as a[name= bar\n\t] (YUI2, Ext);
    • Not throwing errors on invalid, unhandled input.

      Matches invalid selectors ">>>", "[name=]", "[a >= 2]", "#---", returning a result that is either empty or contains elements (All: YUI2, YUI3, Ext, Sencha, jQuery).

    • Parse multiple adjacent whitespace as multiple descendant selector. (Ext)
    • Fail to parse certain whitespace in descendant selector (YUI 2).
    • Splitting input on "," — this breaks attribute selectors where the attribute value contains a comma (e.g. "[title='Hey, Joe']"); (Ext, Sencha).

    Fails to ignore various whitespace in attribute values

    Of the tested libraries, jQuery seems to be the only one that is able to parse (though not according to standard) and ignore extraneous whitespace in attribute selectors (though it fails to match attribute values properly).

    Given the HTML

      <a name="bar">
    
    And selector string:
      "a[name= bar\n\t]";
    

    YUI2 and Ext will not match the a element.

    Not throwing errors on Invalid, Unhandled Input

    All of the tested libraries allow invalid selectors such as "#---".

    YUI and Ext both fail on the descendant selector, as explained below.

    Ext and Sencha split the input on ",", and so will fail with the basic selector '[title="Hello, user"]'. Of course, they will also fail for any valid identifier that contains an escaped comma, as in "#x\\,", which is a perfectly valid selector and works perfectly fine when supplied as an argument to document.querySelector.
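The comma problem can be shown with a short sketch. Both function names below are hypothetical. `naiveSplit` illustrates the failure mode described above rather than any library's actual code, and `splitSelectorGroup` is one possible way to split only on commas that are outside strings and brackets. It is a simplification: it does not track parentheses or validate the selectors it returns.

```javascript
// Naive approach: split on every comma, wherever it appears.
function naiveSplit(selectorGroup) {
  return selectorGroup.split(",");
}

// Sketch: split only on commas that are outside quoted strings and
// outside [...] brackets, keeping backslash escapes intact.
function splitSelectorGroup(selectorGroup) {
  const parts = [];
  let current = "";
  let quote = null; // the currently open quote character, if any
  let depth = 0;    // nesting depth of [...]
  for (let i = 0; i < selectorGroup.length; i++) {
    const ch = selectorGroup[i];
    if (ch === "\\" && i + 1 < selectorGroup.length) {
      current += ch + selectorGroup[++i]; // copy the escaped character too
    } else if (quote) {
      if (ch === quote) quote = null;
      current += ch;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      current += ch;
    } else if (ch === "[") {
      depth++;
      current += ch;
    } else if (ch === "]") {
      depth = Math.max(0, depth - 1);
      current += ch;
    } else if (ch === "," && depth === 0) {
      parts.push(current.trim());
      current = "";
    } else {
      current += ch;
    }
  }
  parts.push(current.trim());
  return parts;
}

console.log(naiveSplit('[title="Hello, user"]'));            // two broken pieces
console.log(splitSelectorGroup('[title="Hello, user"], p')); // two intact selectors
console.log(splitSelectorGroup("#x\\,"));                    // one selector
```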

    The fallback query selector engines in javascript libraries do not follow the lexical grammar defined in CSS2. A library that accepts invalid selectors suffers more problems when it uses an NFD approach. No invalid syntax should be allowed, because allowing it creates more possibility for variance (depending on browser, version, selector string, etc.; see NFD above).

    Descendant Selector

    A Descendant Selector is two or more selectors separated by whitespace. Whitespace is defined in CSS as: Only the characters "space" (U+0020), "tab" (U+0009), "line feed" (U+000A), "carriage return" (U+000D), and "form feed" (U+000C) can occur in white space. Other space-like characters, such as "em-space" (U+2003) and "ideographic space" (U+3000), are never part of white space.
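The whitespace definition quoted above translates directly into a character class. The sketch below is a simplification with hypothetical names: `splitDescendants` assumes a chain of simple selectors with no attribute selectors, strings, or other combinators. It also shows that JavaScript's `\s` is broader than CSS white space, so it is not a drop-in substitute.

```javascript
// CSS white space per the quoted definition: space, tab, line feed,
// carriage return, and form feed. Nothing else.
const CSS_WHITESPACE = /[ \t\n\r\f]+/;

// Split a chain of descendant selectors on runs of CSS white space,
// so that two adjacent spaces still count as one combinator.
function splitDescendants(selector) {
  return selector.split(CSS_WHITESPACE).filter(part => part !== "");
}

const tabbed = splitDescendants("html\u0009body");   // tab
const doubled = splitDescendants("html  body");      // two spaces
const emSpace = splitDescendants("html\u2003body");  // em-space is not CSS white space
const jsThinksSpace = /\s/.test("\u2003");           // but JavaScript's \s matches it

console.log(tabbed, doubled, emSpace, jsThinksSpace);
```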

    YUI 2 and Ext 3.2.1 both fail on the descendant selector.

    Fail to parse certain whitespace in descendant selector

    YUI 2 fails by inconsistently throwing errors with anything other than U+0020 (space). For example, using a tab character, as in "html\u0009body" will, depending on the browser, throw an error with YUI2.

    Parsing multiple adjacent whitespace as multiple descendant selector

    Ext 3.2.1 fails by treating multiple adjacent whitespace as multiple selectors, thus:

    Ext.query("html  body"); // two spaces 
    

    - matches 0 elements, depending on the browser.

  3. Broken Matching

    • Universal selector mismatches
    • Attributes vs properties mistakes
    • Pseudo-class selectors returning every element or throwing errors
    • Case sensitivity applied to case-insensitive attributes

    Universal selector mismatches

    The universal selector, written "*", matches any single element in the document tree (CSS 2.1). The selector is broken in jQuery (see test).

    Attributes vs properties mistakes

    Attributes are string values that the browser parses from the HTML source code. Properties reflect an object's state with any value type (number, boolean, function, etc).

    Most libraries have significant problems with attribute matching, beginning with the progenitor of the confusion: jQuery. These problems are shown below in the jQuery Attributes vs Properties examples.

    Pseudo-class problems

    Pseudo-class problems include returning every element or throwing errors inconsistently.

    In an NFD-based library, when the fallback is used, pseudo-classes such as :focus and :active will either return every element or throw errors. For example:

    :link[rel!=nofollow]; // force fallback with custom != selector.
    
    Ext
    TypeError: Ext.DomQuery.pseudos[name] is not a function
    jQuery:
    Syntax error, unrecognized expression: Syntax error, unrecognized expression: link
    YUI 2:
    [] // (empty result)
    YUI 3:
    Error thrown and not caught: name: TypeError, message: methodName is undefined

    These same libraries will all return a match for valid selector syntax :link in a browser that supports document.querySelectorAll because they use the NFD approach.

    The :link pseudoclass is specified in CSS to match all unvisited links. Most browsers that implement NodeSelector for :link match all links, regardless of whether or not they have been visited. This is allowed by Selectors Level 3 Working Draft and is done to prevent scripts from examining a user's history.

    Throwing errors in one browser while returning a match in another is not interoperable. It would be better to either throw an error for :link everywhere or to support :link everywhere by matching on all links.

    Case Insensitive Attribute Values Treated Case-sensitively

    Are attribute values case sensitive or case insensitive?

    The CSS2 specification states:

    The case-sensitivity of attribute names and values in selectors depends on the document language.

    In HTML 4, each attribute definition includes information about the case-sensitivity of its value. Examples of case-insensitive (CI) attribute values include INPUT element's type and name attributes and the FORM element's action and method attributes, among many others. Some case sensitive (CS) attribute values include the global id attribute, and, for the A element, the name attribute.

    Thus, [method=GeT] must match <form method='get'> while [name=Q] would match <input name="q" type="text"> and not <a name="q">.

    To add to the confusion, HTML5 defines a global case sensitivity map that conflicts with what is defined by HTML 4 for some specific element attributes. For example, HTML5 states that NAME is case-sensitive, though in HTML 4 it is case-insensitive for some elements, such as INPUT.

    Browser implementations vary.

    Internet Explorer 8 and below will correctly match the INPUT element's CI NAME attribute value in a case-insensitive manner while many other browsers will not.

    For the most consistent and interoperable behavior, authors are advised to not rely on case-insensitive attribute matching for NodeSelector but to instead supply the case in the selector string as it appears in the source markup.

    Although most libraries account for case insensitivity in element and attribute names, they do not account for case insensitive attribute values.

    While implementations vary, the javascript library query engines pass the variance right on to their callers, providing inconsistent results.

    A javascript library could provide consistent cross-browser results by either

    • supporting no attribute selectors
    • providing a case-sensitivity map.

    NWMatcher provides a case-sensitivity map. However it does not do so on a per element basis, but instead element-agnostically. NWMatcher follows the recommendation from HTML5.
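A per-element case-sensitivity map could look something like the sketch below. This is an illustration only: the map is hypothetical, contains just the examples mentioned in the text, and is not how NWMatcher or any other library stores this information. Unknown attributes are treated as case-sensitive, which is an assumption of the sketch.

```javascript
// Hypothetical per-element map; true means the attribute value is
// compared case-insensitively. "*" holds entries for all elements.
const CASE_INSENSITIVE_VALUES = new Map([
  ["form", new Map([["method", true]])],
  ["input", new Map([["type", true], ["name", true]])],
  ["a", new Map([["name", false]])],
  ["*", new Map([["id", false]])]
]);

function isValueCaseInsensitive(tagName, attrName) {
  const attr = attrName.toLowerCase();
  // Element-specific entries take precedence over global ones.
  for (const key of [tagName.toLowerCase(), "*"]) {
    const entry = CASE_INSENSITIVE_VALUES.get(key);
    if (entry && entry.has(attr)) return entry.get(attr);
  }
  return false; // assumption: unknown attributes are case-sensitive
}

// `actual` is the attribute value from the markup, or null if absent.
function attrValueEquals(tagName, attrName, actual, expected) {
  if (actual === null) return false;
  return isValueCaseInsensitive(tagName, attrName)
    ? actual.toLowerCase() === expected.toLowerCase()
    : actual === expected;
}

console.log(attrValueEquals("form", "method", "get", "GeT")); // true
console.log(attrValueEquals("input", "name", "q", "Q"));      // true
console.log(attrValueEquals("a", "name", "q", "Q"));          // false
```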

    Not all browsers will follow that case sensitivity map, which is a part of a draft.

    Attribute selectors involve conflicts and interdependencies between the working draft HTML5, CSS 2.1 (PR), the official standard HTML 4.01, and conflicting implementations.

    A program using lower case attribute values except in cases for ID, CLASS, and NAME, can avoid many of the differences, however that doesn't change the problem of a javascript library that runs on a page with form name="F1" and uses a query selector [name="f1"].

    A javascript query library can avoid these problems by disallowing attribute selectors altogether. It can do this by throwing an error on any unsupported syntax. The strategy is explained below.
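As a rough illustration of throwing on unsupported syntax, the sketch below accepts only type, #id, and .class selectors joined by descendant combinators, and throws a SyntaxError for everything else, attribute selectors included. `parseStrict` and its supported subset are hypothetical choices made for this example (ASCII-only names, no escapes); they are not taken from any library.

```javascript
// ASCII-only identifier without escapes (a deliberate simplification).
const NAME = "-?[_a-zA-Z][_a-zA-Z0-9-]*";
// An optional type or universal selector followed by any number of
// #id or .class parts.
const COMPOUND = new RegExp(`^(?:\\*|${NAME})?(?:[#.]${NAME})*$`);

function parseStrict(selector) {
  // Trim and split on CSS white space only.
  const trimmed = selector.replace(/^[ \t\n\r\f]+|[ \t\n\r\f]+$/g, "");
  const parts = trimmed.split(/[ \t\n\r\f]+/);
  for (const part of parts) {
    if (part === "" || !COMPOUND.test(part)) {
      throw new SyntaxError("Unsupported or invalid selector: " + selector);
    }
  }
  return parts;
}

// Helper for the demonstration: does parsing this selector throw?
function rejects(selector) {
  try {
    parseStrict(selector);
    return false;
  } catch (e) {
    return e instanceof SyntaxError;
  }
}

console.log(parseStrict("html  body"));  // two compound selectors
console.log(rejects('[name="f1"]'));     // attribute selectors are refused
console.log(rejects("#---"));            // invalid identifiers are refused
```

Because every unsupported input ends in the same error, callers see one behavior regardless of browser.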

jQuery Selectors Quiz

Not long before writing this article, I published a quiz of 10 multiple-choice questions of simple, common selectors being applied to 9 lines of HTML. I did this after the jQuery team had tweeted about an article demonstrating invalid selector syntax using jQuery's "bare words" attribute/property selector to match property values instead of attributes. The confusion that jQuery has helped propagate was to blur the distinction between properties and attributes. That confusion was shown in the article, which had espoused such invalid techniques and which jQuery had endorsed.

Tweet

Some Good and Advanced jQuery Techniques - http://bit.ly/dli5EN 9:01 AM May 9th

It seems surprising that the jQuery team would espouse that, especially in light of all the attention that has been paid to the broken ATTR function (explained below). My response was to ask the reader what that code actually does.

For each correct answer submitted, the quiz-taker is presented with the explanation of why the answer is correct. At the bottom of the quiz are two example documents that display the results of each question.

Example HTML from Quiz

<img width="600" src="logo.gif" id="imageOne" style="display: none" alt="Write less">
<img src="logo.gif" id="imageTwo" alt="Do More!">
<img width="100" src="logo.gif" id="imageThree"  alt="Write less">
<img width="0" src="logo.gif" id="imageFour"  alt="Write less">
<input type="image" width="600" src="logo.gif" id="inputOne" alt="Write less">
<input type="image" width="100" src="logo.gif" id="inputTwo" alt="Do More!">
<input type="image" src="logo.gif" id="inputThree" alt="Write less">
<input type="image" src="logo.gif" width="0" id="inputFour" alt="Write less">

<pre>-</pre>

Test results (standards, quirks).

There are no tricky edge cases; however, as shown in the test, the answers vary between browsers, with surprising results.

The jQuery-tweeted article espouses the use of $('img[width=600]') to get "All the images whose width is 600px". That's different from what the W3C Selectors API (draft)[SELECT] specifies.

If the attribute's value were quoted, as img[width="600"], then the standard behavior for that query would be to match img elements whose width attribute is exactly the value "600", regardless of whether the image has been rendered at 600px.

In contrast, the Selectors API[SELECT] specifies that an error should be thrown when invalid syntax is supplied. Since 600 is neither a string nor an identifier, the entire selector is invalid. A compliant Selectors implementation must throw an error with that.

The Selectors API Level 1[SELECT] states:

If the given group of selectors is invalid ([SELECT], section 13), the implementation must raise a SYNTAX_ERR exception.

CSS 2.1[CSS2] states:

Attribute values must be identifiers or strings.

And also:

When a user agent cannot parse the selector (i.e., it is not valid CSS2.1), it must ignore the selector and the following declaration block (if any) as well.

Selector Extensions and CSS3 Compliance

Many selector libraries do not throw an error when given invalid selector syntax. Instead, the library interprets the invalid selector as a property selector (described below).

In the case of jQuery being passed an attribute selector, the ATTR function is used to match a property value.

Other libraries will do different things. Some may match attribute values while others do not. None of the javascript libraries are CSS3 compliant.

Before looking at the results of how jQuery handles attribute selectors, some definition of terms is in order.

Attribute Selectors

Standard CSS 2.1 attribute selectors match attributes defined in the source document. Any attribute value must be either a string or an identifier. In CSS2.1, a string is delimited either by single or double quote marks and an identifier is defined:

CSS Identifier

In CSS, identifiers (including element names, classes, and IDs in selectors) can contain only the characters [a-zA-Z0-9] and ISO 10646 characters U+00A1 and higher, plus the hyphen (-) and the underscore (_); they cannot start with a digit, or a hyphen followed by a digit. Identifiers can also contain escaped characters and any ISO 10646 character as a numeric code (see next item). For instance, the identifier "B&W?" may be written as "B\&W\?" or "B\26 W\3F".

The definition is unfortunately looser than what is defined by the lexical grammar of CSS, which disallows identifiers beginning with a hyphen followed by a hyphen. However, the libraries don't match either definition (see also CSS WG bug #174).
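The identifier rules can be approximated with a regular expression. The sketch below follows the stricter, grammar-like form (an optional single hyphen, then a name-start character, then name characters) and takes the non-ASCII range from the prose definition quoted above. The names `IDENT`, `nmstart`, `nmchar`, and `escapeSeq` are chosen for this example, and the pattern is an approximation rather than a complete CSS tokenizer.

```javascript
// Non-ASCII characters, using the range from the prose definition.
const nonascii = String.raw`[\u{A1}-\u{10FFFF}]`;
// A backslash followed by 1-6 hex digits and optional white space,
// or a backslash followed by any character that is not a hex digit or newline.
const escapeSeq = String.raw`\\(?:[0-9a-fA-F]{1,6}(?:\r\n|[ \t\r\n\f])?|[^\r\n\f0-9a-fA-F])`;
const nmstart = `(?:[_a-zA-Z]|${nonascii}|${escapeSeq})`;
const nmchar = `(?:[_a-zA-Z0-9-]|${nonascii}|${escapeSeq})`;
const IDENT = new RegExp(`^-?${nmstart}${nmchar}*$`, "u");

function isIdentifier(text) {
  return IDENT.test(text);
}

console.log(isIdentifier("bar"));          // a plain name
console.log(isIdentifier("600"));          // starts with a digit
console.log(isIdentifier("---"));          // hyphens only
console.log(isIdentifier("B\\&W\\?"));     // escaped punctuation
console.log(isIdentifier("B\\26 W\\3F"));  // numeric escapes
```

Under this pattern, the unquoted value in img[width=600] is not an identifier, which is why that selector is invalid without quotes.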

CSS identifier is also used in class and ID selectors.

CSS 2.1 defines four attribute selectors:

[att]

Match when the element sets the "att" attribute, whatever the value of the attribute.

[att=val]

Match when the element's "att" attribute value is exactly "val".

[att~=val]

Represents an element with the att attribute whose value is a white space-separated list of words, one of which is exactly "val". If "val" contains white space, it will never represent anything (since the words are separated by spaces). If "val" is the empty string, it will never represent anything either.

[att|=val]

Represents an element with the att attribute, its value either being exactly "val" or beginning with "val" immediately followed by "-" (U+002D). This is primarily intended to allow language subcode matches (e.g., the hreflang attribute on the a element in HTML) as described in RFC 3066 ([RFC3066]) or its successor. For lang (or xml:lang) language subcode matching, please see the :lang pseudo-class.

CSS3 Attribute Selectors

[att*=val]

Match element whose "att" attribute value contains the substring "val".

[att^=val]

Match element whose "att" attribute value begins with the string "val".

[att$=val]

Match element whose "att" attribute value ends with the string "val".
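The seven standard attribute operators can be expressed as plain string comparisons on the attribute's value. The sketch below uses a hypothetical `matchesAttribute` function that takes the value as a string, or null when the attribute is absent. It treats values case-sensitively throughout, and it assumes the Selectors Level 3 rule that the substring operators represent nothing when "val" is the empty string. Any other operator, such as !=, throws.

```javascript
const CSS_WS = /[ \t\n\r\f]/;

// `actual` is the attribute's string value, or null when it is absent.
// `operator` is undefined for the bare [att] form.
function matchesAttribute(actual, operator, val) {
  if (actual === null) return false;
  if (operator === undefined) return true;  // [att]
  switch (operator) {
    case "=":                               // [att=val]
      return actual === val;
    case "~=":                              // [att~=val]
      if (val === "" || CSS_WS.test(val)) return false;
      return actual.split(/[ \t\n\r\f]+/).includes(val);
    case "|=":                              // [att|=val]
      return actual === val || actual.startsWith(val + "-");
    case "^=":                              // [att^=val]
      return val !== "" && actual.startsWith(val);
    case "$=":                              // [att$=val]
      return val !== "" && actual.endsWith(val);
    case "*=":                              // [att*=val]
      return val !== "" && actual.includes(val);
    default:
      throw new SyntaxError("Unknown attribute operator: " + operator);
  }
}

console.log(matchesAttribute("Write less", "~=", "less"));
console.log(matchesAttribute("en-US", "|=", "en"));
console.log(matchesAttribute("logo.gif", "$=", ".gif"));
```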

Property Matching

Dynamic object properties can be of any value and reflect the object's state. Matching attribute selectors against properties is nonstandard. This is what jQuery does most of the time.

jQuery Attribute (Property) Selector Syntax Extensions

jQuery defines additional nonstandard extensions, for example, an incomplete list of just two:
[att!=val]
Represents an element whose property att is either undefined or is not val.
:animated
Select all elements that are in the progress of an animation at the time the selector is run.

The :animated selector is inherently coupled to jQuery.

Other javascript libraries copy some of the jQuery selectors but implement them differently. Rather than trying to match against property values, the other libraries match against attribute values in more cases, though still often matching properties in MSIE.

What Does jQuery Do?

jQuery does what the blog article says it does. Well, in a few browsers, and depending on the rendering mode and the CSS that has been applied to the elements. What jQuery does varies widely across browsers.

Bare Words Attribute Values Test

jQuery's bare words attribute selector performs property matching in the examples in the article.

'img[width=600]'
Opera 10.5
    imageOne
    imageTwo
Firefox 3.6, Firefox 2, Safari 4, Chrome 4:
    imageOne
    imageTwo
    imageThree
    imageFour
IE6 and IE7 (standards mode), IE8 (EmulateIE7)
(empty result)
IE6 and IE7 (quirks mode), IE8 and IE9 (either mode)
    imageTwo
    imageThree
    imageFour

Cross Browser Results Analysis

The results above show inconsistent results from recent versions of browsers that jQuery supports.

In fact, in IE8 alone, jQuery can result in three possible different results. This is because in IE8, NodeSelector is unavailable in both quirks mode and IE7 mode. Property values can vary between those modes. This leaves the possibility for jQuery attribute selectors to match attributes, or one of two different property values, depending on if the document is in quirks mode.

Had the selector's attribute value been a string (surrounded by quotation marks), as img[width='600'], then following the Selectors API, it must match all img elements whose width content attribute is exactly the value "600".

However, because jQuery uses querySelectorAll first (NFD), img[width='600'] matches img elements whose width attribute is "600" where querySelectorAll is available, and, in browsers that lack querySelectorAll, matches img elements whose width property is 600.

jQuery Property Matching Example

$("body[ownerDocument]", "html").length

All tested browsers:
1

$("html body[ownerDocument]").length

Safari 4, Firefox 3.6, IE8, 9, Chrome 4, Opera 10:
0
Firefox 2, IE6 and 7 (either mode), IE8 and 9 (quirks mode):
1

The example shows:

  1. jQuery performs property matching of ownerDocument in some browsers
  2. when a context parameter is passed, the property matching occurs in all tested browsers

It is a bad idea to try to read ownerDocument this way; however, some might actually think it is a good idea to read an input's checked property this way, a classic mistake, and one which unfortunately made it into the core of jQuery.

Attributes vs Properties

The basic difference between attributes and properties is that attributes are string values that the browser parses from the HTML source code, while properties reflect an object's state with any value type (number, boolean, function, etc).

jQuery has never handled attributes properly[1][2][3][4][5][6]. jQuery is designed in such a way that does not clearly distinguish attributes from properties. The most common versions of Internet Explorer have this same problem.

jQuery/Sizzle ATTR Matcher

The source code for Sizzle shows how object properties, before attributes, are matched.

ATTR: function(elem, match){
    var name = match[1],
    result = Expr.attrHandle[ name ] ?
        Expr.attrHandle[ name ]( elem ) :
        elem[ name ] != null ?
         elem[ name ] :
         elem.getAttribute( name ),
        value = result + "",
        type = match[2],
        check = match[4];

The line:

elem[ name ] != null ? elem[ name ]

- tests whether the element's property is neither null nor undefined. If so, the property value is used; otherwise, getAttribute is used as a fallback.
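The effect of that resolution order can be sketched without a browser. The following is only an illustration: fakeImg is a plain object standing in for an element, its values are hypothetical, and the two resolve functions are names invented here (the attrHandle step of the real matcher is left out).

```javascript
// Plain-object stand-in for <img width="600">, so the sketch runs without a DOM.
// All values are hypothetical.
var fakeImg = {
  width: 400,                      // property: current state, a number
  ownerDocument: {},               // property with no corresponding attribute
  getAttribute: function (name) {
    var attrs = { width: "600" };  // attributes: strings parsed from the source
    return attrs.hasOwnProperty(name) ? attrs[name] : null;
  }
};

// Property first, attribute as fallback, mirroring the excerpt above.
function resolvePropertyFirst(elem, name) {
  return elem[name] != null ? elem[name] : elem.getAttribute(name);
}

// Attribute only, which is what a standard attribute selector considers.
function resolveAttributeOnly(elem, name) {
  return elem.getAttribute(name);
}

var widthByProperty = resolvePropertyFirst(fakeImg, "width") + "";   // "400"
var widthByAttribute = resolveAttributeOnly(fakeImg, "width");       // "600"

// ownerDocument is not an attribute, yet property-first resolution finds it.
var ownerDocByAttribute = resolveAttributeOnly(fakeImg, "ownerDocument"); // null
var ownerDocByProperty = resolvePropertyFirst(fakeImg, "ownerDocument");  // an object
```

With property-first resolution, a selector such as [ownerDocument] can match even though no such attribute exists, which is consistent with the ownerDocument results shown earlier.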

It would seem to make more sense to use elem.getAttribute instead; however, that would still leave behind problems with MSIE's completely broken implementation of attributes, prior to IE8.

getAttribute(att, 2) cannot be used safely in IE because IE throws errors in some cases and returns wrong values in others, such as strange numbers for values of boolean attributes (MSDN).

input.getAttribute("disabled", 2); // Result number 0 in IE.

Treating properties as attributes appears to have been a fundamental design oversight in early jQuery. Changing the method to use a strategy to resolve attribute values would change behavior in programs that use jQuery, jQuery UI and any and all plugin dependencies.

So while changing ATTR to match attributes would make sense, it would not be practically possible in IE (due to bugs in IE). IE bugs aside, changes to ATTR would result in a substantial change propagation to any and all dependencies. The problem cannot easily be fixed, as jQuery is a public API and public APIs are forever.

jQuery applies attribute selectors to match object properties, but where querySelectorAll is implemented, and an error is not thrown, jQuery resolves attributes.

Attribute values and properties are completely different things. Performing attribute selector matching by testing elements' property values, as jQuery does, is a significant deviation from the way standard attribute selectors work.

CSS 2.1 Identifier: ID and Class Selectors

The CSS production for Identifier is also used for ID and class selectors.

ID Selector

The ID selector is "#" followed by an identifier; B&W? is not an identifier and so #B&W? is not a valid ID selector. However, in jQuery, the production for identifier is not matched; jQuery uses its native-first dual approach: querySelectorAll throws an error, jQuery catches it, and falls back to oldSizzle.

document.querySelectorAll("#B&W?"); // Error.
jQuery("#B&W?"); // Result of 0 objects matched.

Class Selector

The Class selector is "." followed by an identifier.

document.querySelectorAll(".B&W?"); // Error.
jQuery(".B&W?"); // Result of 0 objects matched.

jQuery is not the only library that has problems with these selectors.

Ext-JS documentation uses invalid syntax and misleads the reader by falsely stating:

The use of @ and quotes are optional. For example, div[@foo='bar'] is also a valid attribute selector.

No! The @ in an attribute selector is not valid in a CSS selector. Quotes are not optional for CSS selectors. Omitting quotes in Ext (the big one) may result in an error being thrown, depending on whether the attribute value is an Identifier, and when an error is thrown, the fallback is used.

The same documentation is used for Sencha, and when an invalid query is passed in Sencha, the result is a javascript error. An XPath attribute selector using [@foo='bar'] would cause an error to be thrown in any browser. The difference with Sencha is that the error is not caught; no fallback is provided.

Other Libraries: A Peek at YUI, Ext-JS, and "My Library"

YUI 2

YUI 2 supports some jQuery extensions, but for other extensions, and even for some standard CSS selectors, it returns wrong results. For example, ":link" and ":disabled" return every element in the document.

YUI 2 supports only U+0020 white space in selectors and throws errors on anything else. Mootools 1.2 has the same problem, throwing an error if the selector contains whitespace it can't recognize (such as tab). Contrast this with what is specified for whitespace in CSS2.
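For comparison, CSS 2.1 allows five characters in white space: space, tab, line feed, carriage return, and form feed. A minimal sketch of a pre-processing step that treats all five alike might look like the following; the function name is hypothetical, and the sketch does not special-case white space inside quoted strings.

```javascript
// White space characters permitted by the CSS 2.1 grammar:
// space, tab, line feed, carriage return, form feed.
var CSS_WHITESPACE = /[ \t\n\r\f]+/g;

// Collapse any run of CSS white space to a single U+0020, so that later
// parsing only ever has to deal with one kind of separator.
function normalizeSelectorWhitespace(selector) {
  return selector.replace(CSS_WHITESPACE, " ");
}

var normalized = normalizeSelectorWhitespace("ul.panel\t>\n li");
// normalized is "ul.panel > li"
```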

What's Next?

Upon learning that the library creates more cross browser problems than it solves, the next logical step for the library user should be to stop using it, to remove it, and not to jump blindly to another library.

The most logical next step for developers realizing that their query library is failing to live up to what was promised is to learn how to create reusable, forwards-compatible abstractions that follow standards and work consistently across browsers.

For Library Users and Management

There is no substitute for knowledge (Deming). One who wants to build RIAs must read all of the pertinent specifications and all of the pertinent browser documentation. He should have a good foundation of OO principles and methodologies.

The following advice is offered to the reader:

  • RTM - (ECMA, CSS2.1, Selectors, DOM 2 HTML, HTML4, HTML5, also MDC and MSDN).
  • Test across many browsers, including older browsers, to test degradation paths.
  • Get code reviews
  • Ask smart questions

Everyone in the company should be focused on the successful production of a quality product.

If management is making decisions about javascript (libraries or otherwise) and they are not technically qualified to make technical assessments of quality, then they are effectively hurting the company and they need to stop doing that.

For Library Authors

Reusable abstractions are useful to fulfill software requirements quickly. The concept of javascript library must evolve beyond the current state.

Independent, Cohesive parts

The libraries reviewed are highly interdependent.

The javascript programming language allows interface-based design without having to create an actual interface. For example, if a method `elementHasClass` is needed, then that method could easily exist independently, and such methods do exist in YUI, for example. There should be no need to depend on the concretion of the entire YUI core, just that method and whatever it depends upon.

An interface does not need to "change the way you write javascript." That would be the task of an IoC-type framework.

An interface does not promote dependence on one Big Thing. An interface should do one thing.

Fundamentally Broken Abstractions

A fundamentally broken abstraction cannot function consistently across browsers in all contexts. What can be done about such problems?

One approach is to try to make the abstraction work in all contexts. A few extreme examples of that are in the "APE.dom.getOffsetCoords" function I wrote several years back. Another is David Mark's attribute reading function attr.

As seen in these examples, modules that have more browser-difference workarounds become increasingly complex, have more edge cases, and are much harder for humans to digest.

To avoid having a linear dependence on any abstraction, the object that is using the abstraction can be configured so that the abstraction it uses can be switched to another abstraction with the same interface.
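As a rough sketch of that idea, assume a one-method interface, hasClass(el, name), with two interchangeable implementations. All names below (classNameStrategy, classListStrategy, Panel) are hypothetical and chosen only for illustration.

```javascript
// Two hypothetical implementations of the same one-method interface:
// hasClass(el, name) -> boolean.
var classNameStrategy = {
  hasClass: function (el, name) {
    return (" " + el.className + " ").indexOf(" " + name + " ") !== -1;
  }
};

var classListStrategy = {
  hasClass: function (el, name) {
    return el.classList.contains(name);
  }
};

// The consumer is handed an implementation; it depends only on the
// interface, never on a particular concretion.
function Panel(el, classUtil) {
  this.el = el;
  this.classUtil = classUtil;
}

Panel.prototype.isActive = function () {
  return this.classUtil.hasClass(this.el, "active");
};
```

Swapping the implementation is then a matter of passing a different object when the Panel is created; Panel itself does not change.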

Speculative Generality

An interface that is built for "users" tends to lead to too much generality, as seen in the popular libraries, burgeoning with features, complexity, and bugs.

An interface that addresses differences in browsers should follow standards and use feature testing to derive strong inferences about the client environment and should limit what it does to the least capable environments.

Code Review

Blind acceptance of library code caused the Ajax library problem. Too many awful APIs, too much misinformation and the result is a catastrophe.

To avoid the mistakes, libraries will need more peer review, and that starts with you, reader. The next time you want to evaluate a library, look carefully at the source code.

Technical management (at least in America) thinks that they can get away with copying what everybody else is doing but they don't realize that this is hurting quality. Ajax development has become an extreme case of the blind leading the blind.

Focus on Quality

Teams that focus on short term costs and "getting things done" sacrifice quality. Sacrificing quality creates technical debt. Technical debt hurts quality and increases the effort (and cost) of maintaining the code.

The most important step that a company can take is to focus on fulfilling its goals with quality solutions. A company that focuses on quality will, in the long run, reduce the costs of production.

You can not afford to do things wrongly.

Cross Browser Abstractions - Wrapper with a Fallback

What follows is a strategy for developing a consistent interface, limited by the least common denominator.

The strategy uses a mixture of standard features, where those are available, and compatible fallbacks where they are missing or found buggy (by capability tests).

Although the example of the concept is about Selector queries, the conceptual pattern and strategy itself is applicable to many situations.

Filter the Input

A query selector that behaves consistently across browsers must verify that all inputs behave consistently across all known implementations.

The library can decide which selectors will be unsupported and filter them out. Some selectors, such as :visited, are not possible to implement and not very useful anyway. Supporting attribute selectors for IE is more trouble than it is worth.

Using NodeSelector

If the library chooses to use NodeSelector, then it must follow the specification to the letter. It must not allow any invalid selectors. It must not extend the CSS2 selectors syntax.

Capability Tests

To determine whether the browser provides a sufficient implementation of document.querySelectorAll, perform capability tests that check not only for the existence of document.querySelectorAll, but also for known problems with the supported selectors[13].
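A minimal sketch of such a test follows. The function name isQSAUsable is hypothetical, and the single probe used here (an invalid selector must throw) is only one assumed example of a known-problem check; a real test suite would probe each selector the library claims to support.

```javascript
// Returns true only if doc.querySelectorAll exists and passes a probe.
function isQSAUsable(doc) {
  if (!doc) {
    return false;
  }
  // Host methods may report "object" rather than "function" in some browsers.
  var type = typeof doc.querySelectorAll;
  if (!(type == "function" || (type == "object" && doc.querySelectorAll))) {
    return false;
  }
  try {
    doc.querySelectorAll("#B&W?"); // invalid selector; a conforming engine throws
    return false;                  // silently accepted: not trustworthy
  } catch (ex) {
    return true;
  }
}

// Evaluated once; false where there is no document at all.
var IS_QSA_SUPPORTED = typeof document != "undefined" && isQSAUsable(document);
```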

For example, a CSS1-compliant query selector engine could employ a strategy where if the selector did not match a validity constraint, then an error would be thrown.

function makeQuery(selector, doc) {
    if(!isValidSelector(selector)) {
        throw new InvalidSelectorError(selector);
    }
    doc = doc || document;
    if(IS_QSA_SUPPORTED) { 
        return doc.querySelectorAll(selector);
    } else {
        return makeQueryFallback(selector, doc);
    }
}

Consistently throwing an error for unsupported selectors avoids the inconsistencies seen with the native-first dual approach.
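The isValidSelector check used in makeQuery is not defined in this article. One hypothetical way to write it, assuming the library supports only type, class, and ID selectors joined by the descendant combinator, is a whitelist pattern. The identifier pattern below is deliberately simplified (ASCII only, no escapes) and is narrower than the CSS 2.1 grammar.

```javascript
// Simplified identifier: ASCII only, no escapes (an assumption of this sketch).
var IDENT = "-?[_a-zA-Z][_a-zA-Z0-9-]*";

// A sequence of simple selectors: a type or "*" followed by any #id / .class
// parts, or one or more #id / .class parts on their own.
var COMPOUND = "(?:(?:" + IDENT + "|\\*)(?:[#.]" + IDENT + ")*|(?:[#.]" + IDENT + ")+)";

// Compounds joined by the descendant combinator (CSS white space).
var SUPPORTED = new RegExp("^" + COMPOUND + "(?:[ \\t\\n\\r\\f]+" + COMPOUND + ")*$");

// Anything outside the whitelist is rejected, in every browser alike.
function isValidSelector(selector) {
  return typeof selector == "string" && SUPPORTED.test(selector);
}
```

Because the decision is made before any browser API is called, the same input is accepted or rejected everywhere, regardless of whether querySelectorAll is present.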

The library function could safely use doc.querySelectorAll as a fallback, so long as the known implementations consistently support all of the selectors that the library supports[12]. And, in case you didn't notice, this strategy will work cross-frame, unlike every other selector engine.

  <Is input Supported?>
   |           |
   Y           N- [Throw Error] -END| 
   |              
  <Native QSA Support?>
   |               |
   Y -[Use QSA]    N - [Use fallback]

This is somewhat similar to the strategy used by Dojo, which documents that some selectors are unsupported.

The Value of Queries

The selector APIs in the several prominent javascript libraries reviewed are so broken that they obviously cannot be relied on.

The value of selectors that work as specified by the specifications has not been established.

There are several alternatives to using selectors. Anyone, though especially those who are using a broken selectors API, should question how much value the abstraction provides. He should compare that value, positive or negative, to the alternatives.

Drawbacks to Queries

The program design approach of using DOM traversal to select nodes and then performing an action on one or more of them is usually much less efficient than standard alternatives.

When DOM traversal is performed on page load, it can cause the page load to seem slower, especially if the action inside the loop triggers a recalc (also commonly called "reflow"). For a large document, this can cause the page to become unresponsive, as seen on the WHATWG HTML5 draft specification "full version" [14]. Such performance issues are more likely to affect slower systems, not fast developer systems.

Query Matching Strategy

Most uses of queries don't allow for the common traversal pattern of finding an ancestor. Such a traversal pattern is often needed when using event delegation strategies, where the callback needs to find an ancestor matching particular criteria, usually either ID, className or tagName.

var sel = new Selector("ul.panel");

function clickCallback(ev) {
  var target = DomUtils.getTarget(ev);
  if(sel.test(target)) {
    panelListClickHandler(ev);
  }
}

To handle this functionality, the Selector.test method could use Element.matchesSelector(txt) (after capability testing, of course). This is implemented in Gecko as Element.mozMatchesSelector and in webkit as Element.webkitMatchesSelector.
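A sketch of how Selector.test might delegate to whichever of those methods is present is shown here. The Selector constructor follows the usage in the example above but is otherwise hypothetical, as is findMatchesName; only the three method names come from the text.

```javascript
// Candidate method names: the unprefixed name plus the two vendor-prefixed
// names mentioned above.
var MATCHES_NAMES = ["matchesSelector", "mozMatchesSelector", "webkitMatchesSelector"];

// Capability test: returns the first supported method name, or null.
function findMatchesName(el) {
  for (var i = 0; i < MATCHES_NAMES.length; i++) {
    if (typeof el[MATCHES_NAMES[i]] == "function") {
      return MATCHES_NAMES[i];
    }
  }
  return null;
}

// Hypothetical wrapper matching the usage "new Selector(text).test(el)".
function Selector(text) {
  this.text = text;
}

Selector.prototype.test = function (el) {
  var name = findMatchesName(el);
  if (!name) {
    // A real library would use its fallback matcher here.
    throw new Error("No native selector matching available");
  }
  return el[name](this.text);
};
```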

Since selector traversal and parsing is slower, another alternative would be to support only simple selectors but without attributes, so limiting to type (element), class, and ID.

Alternatives

The "find something, do something" approach has efficient alternatives.

If the "do something" action is adding an event handler to various nodes, then that action can be replaced by using event delegation. This is done by adding an event listener to a common ancestor.
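A minimal sketch of delegation by class name, with no selector engine involved, could look like this. The helper names (hasClass, findAncestorWithClass, makeDelegate) are hypothetical.

```javascript
// True if el's className contains the given token.
function hasClass(el, name) {
  return (" " + (el.className || "") + " ").indexOf(" " + name + " ") !== -1;
}

// Walk from node up through parentNode; return the first node (the starting
// node included) that has the class, or null if none does.
function findAncestorWithClass(node, name) {
  for (var el = node; el; el = el.parentNode) {
    if (hasClass(el, name)) {
      return el;
    }
  }
  return null;
}

// Build one listener for the common ancestor. The handler runs only when the
// event originated inside an element with the given class, with that element
// as its this value.
function makeDelegate(className, handler) {
  return function (ev) {
    var target = ev.target || ev.srcElement;
    var match = findAncestorWithClass(target, className);
    if (match) {
      handler.call(match, ev);
    }
  };
}
```

The returned function is registered once on the common ancestor; nodes added to the document later are covered without any further querying.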

If the "do something" action is modifying styles, then the script can add a className token to a common ancestor of the matched nodes, allowing the browser to apply the cascade to descendant nodes. An example of this is linked from the design section of the code guidelines for comp.lang.javascript.
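As a sketch of the style case, assume a stylesheet rule such as `.compact li { padding: 0 }` already exists. The script then needs only one write to the ancestor; the helper name addClassToken is hypothetical.

```javascript
// Add a class token to el.className unless it is already present.
function addClassToken(el, token) {
  var current = el.className || "";
  if ((" " + current + " ").indexOf(" " + token + " ") === -1) {
    el.className = current ? current + " " + token : token;
  }
}

// Usage, with a plain object standing in for the common ancestor element:
var listContainer = { className: "panel" };
addClassToken(listContainer, "compact");
// listContainer.className is now "panel compact"
```

No descendant is queried or touched by the script; the browser restyles them through the cascade.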

Conclusion

The current javascript library APIs do not adhere to the CSS2 specification for selectors.

Library documentation for selectors often does not differentiate between standard selectors and nonstandard extensions. For libraries that use the NFD approach, query results vary widely not only between browsers, but even in the same browser, depending on something as trivial as an unquoted attribute value.

The native-first dual approach does not normalize browser behavior. Instead, it amplifies the differences between browsers that have native support and those that don't.

Design problems are not limited to the query engines, but include other parts of the library and extend to their dependencies. Other such design problems seen in libraries include browser detection, fake method overloading, and useless methods that don't do what their name indicates. All at a cost of increased bytes and instability.

Any javascript developer who uses jQuery, YUI, Ext-JS, or Sencha either has not read the source code enough, is not capable of understanding the problems, or has read and understood the problems but has not appreciated the consequences deeply. The use of any one of these libraries is a substantially irresponsible and uninformed decision that puts quality and long-term success of his project at risk.

Today's Ajax libraries are interdependent monoliths that promise what is not practically possible. The problems with javascript libraries can be avoided by favoring simple interface-based design that avoids browser issues.

References

Specifications and Drafts

Other References

Tests

Posted by default at 12:15 AM in JavaScript

Detecting Global Pollution with the JScript RuntimeObject Sunday, 11 April 2010

This article is about debugging with JScript's RuntimeObject (msdn). All of the examples work in IE 5.5+, though most do not work in any other browser.

Leaked Global Identifiers

Say you accidentally created a global property, as in the following:

function playRugby(players) {
  var items = players,
      i;
      len = items.length; // Global: the ";" after i ended the var statement.
}

function kick() {
  var x = 10
      y = 11; // ASI makes y global.
}

When playRugby is called, a global property len is created, if it does not already exist, and then assigned the value of items.length. Likewise, when kick is called, a global property y is created.

These globals are unintentional. They break encapsulation and leak implementation details. This can result in conflict and awkward dependency issues.

To detect accidentally created global identifiers, we can loop over the global object using for in. Firebug provides this convenient global inspection under the "DOM" tab.

Everybody's Favorite Browser

Unfortunately, in IE, the for in won't enumerate any global variables or function declarations, as seen in the example below.

Example Enumerating the Global Object

// Property of global variable object.
var EX1_GLOBAL_VARIABLE = 10;

// Property of global object.
this.EX1_GLOBAL_PROPERTY = 11;

// Property of global variable object.
function EX1_GLOBAL_FUNCTION(){}

(function(){
  var results = [];
  for(var p in this) {
    results.push(p);
  }
  alert("Leaked:\n" + results.join("\n"));
})();

The result in IE contains a mix of window properties and only one of the three user-defined properties: EX1_GLOBAL_PROPERTY.

So what happened to the other two user-defined properties? Why didn't they show up in the for in loop?

It turns out that enumerating over the global object will enumerate properties assigned to the global object and will not enumerate global variables.

An educated guess as to why global properties are enumerated but global variables are not might be that JScript gives global variables (declared with var), the DontEnum flag. Since the global object is specified as being the global Variable object, this seems like a likely explanation. It would be nonstandard, but it would explain the behavior in IE. Eric Lippert, however, provided a different explanation: The global object and the global variable object are two different objects in IE.

According to MS-ES3:

JScript 5.x variable instantiations creates properties of the global object that have the DontEnum attribute.

Enumeration Solution: The JScript RuntimeObject

To enumerate over global properties, use the JScript RuntimeObject method. Instead of enumerating over the global object, as you would in a normal implementation, enumerate over the object returned by the global RuntimeObject method.

var GLOBAL_VAR1, 
    GLOBAL_VAR2, 
    GLOBAL_VAR3 = 1; 
    GLOBAL_PROP1 = 12;

function GLOBAL_FUNCTION(){}

if(this.RuntimeObject){
    void function() {
        var ro = RuntimeObject(),
            results = [],
            prop;
        for(prop in ro) {
            results.push(prop);
        }
        alert("leaked:\n" + results.join("\n"));
    }();
}

IE Result

The result in IE 8 and below includes (among other things, including window) GLOBAL_FUNCTION, GLOBAL_VAR3, and GLOBAL_PROP1, in the order in which they were evaluated. Notice that neither GLOBAL_VAR1 nor GLOBAL_VAR2 was included. It appears that RuntimeObject does not include variables that have not been assigned a value. According to Microsoft's documentation, this is not the specified behavior (more on this below).

Microsoft RuntimeObject Documentation

The JScript RuntimeObject is a built-in extension to JScript. JScript defines seven additional built-in global methods: ScriptEngine, ScriptEngineBuildVersion, ScriptEngineMajorVersion, ScriptEngineMinorVersion, CollectGarbage, RuntimeObject, and GetObject. These objects are all native JScript objects, not to be confused with host objects.

For RuntimeObject, Microsoft JScript Extensions [MS-ES3EX] states:

The RuntimeObject function is used to search a global object for properties with names that match a specified pattern. The function only locates properties of the global object that were explicitly created by VariableStatement or FunctionDeclaration functions, or that were implicitly created by appearing as an identifier on the left side of an assignment operator. The function does not locate properties that were created by means of explicit property access on the global object.

Superficial testing indicates that Microsoft's documentation is wrong.

The returned object does not include all identifiers that were added to the Variable object; it includes only those identifiers that have been assigned a value. Whether they were created from VariableDeclaration, FunctionDeclaration, or assignment as global properties does not matter.

Example of Finding Identifiers Created By FunctionBindingList

All identifiers in a FunctionBindingList of a JScriptFunction will become properties of the containing Variable object, so, for example:

var foo = {}, undef, ro;
(function(){ function foo.bar, baz(){} })();
ro = RuntimeObject();
alert([ro.foo.bar, "undef" in ro].join("\n"));

IE alerts:

function foo.bar(){}
false

Browsers that do not run JScript (that is, browsers other than IE) can be expected to throw SyntaxError upon parsing the FunctionBindingList of the JScriptFunction production. This is to be expected, as it is a syntax extension.

Bookmarklet

As a bookmarklet:
javascript:(function() {var ro=RuntimeObject(),r=[],i=0,p;for(p in ro){r[i++]=p;}alert('leaked:\n'+r.join('\n'));})();

JScript Syntax Extension

The earlier example "Finding Identifiers Created By FunctionBindingList" mentioned the JScript Extension JScriptFunction. In case the name is not a dead giveaway, this is a JScript language extension. The production for JScriptFunction is:

JScriptFunction : 
function FunctionBindingList ( FormalParameterListopt ) { FunctionBody }

RuntimeObject(filterString): The filterString Parameter

The RuntimeObject method accepts an optional filter string to match identifiers. Unfortunately, filterString is not converted to a regular expression but is used for substring matching with optional leftWild and rightWild, defaulting to *.

This means that, for example: filterString = "a*" would match identifiers a and a1 but not ba.

Conclusion

Documentation bugs and shortcomings aside, the RuntimeObject provides a useful solution to the problem of enumerating global properties in JScript. An advantage of RuntimeObject is that it includes only user-defined properties, with the exception of the global window property.

The aforementioned bookmarklet provides a convenient way to check a page to see the globals that have been accidentally created (it also shows that this site is not a shining example of keeping the global object clean).

Other Applications for RuntimeObject

Cross Browser Identifier Leak Bookmarklet

Writing a cross-browser identifier leak detector is the next logical step to an IE-only identifier leak detector.

Automated Identifier Leak Detection

Checking for accidental global identifiers should be automated.

The YUI Test unit test framework provides hooks for TEST_CASE_BEGIN_EVENT and TEST_CASE_COMPLETE_EVENT. These events can be used to inspect the RuntimeObject and catch global identifier leaks that occur throughout the runtime execution of program code.

In TEST_CASE_BEGIN_EVENT, inspect the RuntimeObject and save the result. In TEST_CASE_COMPLETE_EVENT, inspect the RuntimeObject again and compare the results with the results saved during TEST_CASE_BEGIN_EVENT. Each property that appears in TEST_CASE_COMPLETE_EVENT but was not present in the result saved from TEST_CASE_BEGIN_EVENT is a leaked global identifier, and a test case warning can be logged for it.
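The comparison step can be sketched independently of any test framework. The two function names below are hypothetical, and wiring them to the YUI Test events is left out; in IE the object passed to snapshotKeys would be the result of RuntimeObject().

```javascript
// Record the enumerable property names of obj as keys of a lookup object.
function snapshotKeys(obj) {
  var keys = {}, p;
  for (p in obj) {
    keys[p] = true;
  }
  return keys;
}

// Return the names present in the "after" snapshot but not in "before".
function findLeaks(before, after) {
  var leaks = [], p;
  for (p in after) {
    if (Object.prototype.hasOwnProperty.call(after, p) &&
        !Object.prototype.hasOwnProperty.call(before, p)) {
      leaks.push(p);
    }
  }
  return leaks;
}

// Usage with plain objects standing in for two RuntimeObject() results:
var beforeTest = snapshotKeys({ window: 1, APP: 2 });
var afterTest = snapshotKeys({ window: 1, APP: 2, len: 3, y: 4 });
var leaked = findLeaks(beforeTest, afterTest); // ["len", "y"]
```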

References

  • [MS-ES3EX]: Microsoft JScript Extensions to the ECMAScript Language Specification Third Edition.

Posted by default at 4:23 PM in Browsers

Ebay Facilitates Fraud Thursday, 18 March 2010

Won an auction on eBay for:

"You Won eBay Item:DELL E1705 INTEL DUO 2 GHZ, WINDOWS VISTA ULTIMATE (290295125189)"

I got that laptop nearly two weeks after I paid for it.

Shortly thereafter, I got it in to a local tech who, after about four days, informed me that the copy of windows that was installed was unlicensed.

It turns out MS Office, which was also advertised as being included in the auction, is also pirated.

Paypal Claim

I immediately filed a complaint with the payment system, Paypal: "Item significantly not as described". The claim requested that the seller provide a license key for Windows.

The seller could not provide a license key for windows because he does not have one. He decided to make the irrelevant excuse "I lost the CD" to paypal, and somehow, that worked.

On March 20, 2009, I received the email from paypal stating:

We will notify you if further action is required.

I called paypal and the representative said I would receive contact from paypal by April 17 (IIRC). That never happened, and Paypal closed the claim on May 24, 2009, notifying me with the following:

Paypal Case Details

We have concluded our investigation into this case. Unfortunately, 
at this time we are unable to decide this claim in your favor.

-----------------------------------
Case Details
-----------------------------------


As you can see, there are no "case details" there; the entire section is completely blank.

As soon as I received this, I called Paypal to ask why the case had been closed. The Paypal representative could not provide a reason, nor could he state what, if anything, paypal had done. He reopened the case and stated that a Paypal representative would contact me.

Paypal did not contact me, but instead closed the case again.

I went through the process of calling Paypal again, waiting on hold for a long time, and again talking to a rep. The case was reopened, and then once again re-closed, without any evidence that paypal had actually done anything to investigate the claim.

I have saved all of the email communications with Paypal. The responses from Paypal indicate that the paypal representatives failed to read my messages and failed to read the auction title which was included in the message.

Paypal has provided no evidence to having done anything to investigate my claim.

Back to Ebay

After Paypal failed, I called ebay. I made many calls to ebay, each time having to restate everything from the beginning. I received follow-up email from rswebhelp@ebay.com stating that I should contact the police, file a complaint with IC3, and file a mail fraud complaint with USPS.

Ebay Action

Like paypal, Ebay ignored many, if not most of my emails. Of the emails that I received a reply to, the responses do not include answers to the questions. They are a top-reply of mostly irrelevant parroting of what appears to be copy'n'pasted information on how to call the police, file a claim with the post office, or contact IC3. I have done all of those things.

Ebay stated that they work closely with the police. I provided rswebhelp with the police report number and requested for them to call the police but the request was ignored. What ebay says and what ebay does do not match.

The San Francisco Police officer I reported to told me that the case would be only paperwork for them. He would not even look at the URL. I cannot force them to change how they operate.

The Seller

I followed up with Dave Kaercher by sending email and by calling. The phone message I left was not replied to and the email (all emails I have sent him) was ignored. I did get through a week later and told Dave the problem and that I was seeking a refund of money I paid. Dave said "we don't have a deal" and that was the end of the conversation.

The computer has not provided me with the use that could be expected out of a computer with legal, licensed versions of Windows and Office.

Instead of using Windows, I have been hobbling along with a cracked OS for 1 year. The OS frequently restarts (in failed attempts to run important security updates).

Instead of the expected use of Microsoft Office, launching Office errs with:

Microsoft Office Genuine Advantage

  [logo] This copy of Microsoft Office is not genuine.
  Please excuse this interruption. This copy of Office did not 
  pass validation. Click Learn More for online details and help 
  identifying the best way to get genuine Microsoft Office.
              [Learn More] [Remind Me Later]

Clicking "Learn More" leads to the web page: Genuine Microsoft Software

I have actively pursued this for over a year, with several emails and calls to ebay, paypal, and Dave Kaercher. I am posting the seller's personal information on my site.

Dave Kaercher

Dave Kaercher: I told you I would do this. I clearly requested licensed copies of what I paid for. I stated this in emails to you and in the paypal claim. You made excuses and ignored those.

I told you two weeks ago by email (which you ignored) and by phone call two weeks ago that I would post your information on my site.

Dave Kaercher's Personal Info

Dave Kaercher GRI, QSC User ID: redbirds04 Name: Dave Kaercher website: wesellmore.com City: Colorado Springs State: CO Country: United States Phone: (719) 282-1681

Lesson Learned

Don't buy things off eBay!

EBay knowingly facilitates fraud. EBay makes money by helping criminals defraud consumers and as such, is guilty of fraud.

Dave Kaercher is in good standing with ebay, probably defrauding other victims. He continues to promote auctions for various things including fake steroids over the past year. It is clear that eBay does not care to take this seriously.

Anyone shopping for items should consider that what is sold on ebay might be illegitimate (pirated, counterfeit, etc). In the case that the item is illegitimate, ebay probably won't do anything about it.

Posted by default at 11:00 AM in Uncategorized

Myopia and the Opera 10 User Agent String Friday, 29 May 2009

Opera has conceived a silly tactic for the Opera 10 user-agent string.

The problem is that there are scripts that expect the browser major version to be a single digit and will fail if it is not.

Since "10" is not a single digit, these scripts fail.
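To illustrate the failure (not to endorse browser detection), here is a sketch of the kind of parsing that breaks, next to a version that reads the whole number. The user-agent strings are simplified, hypothetical stand-ins, and both function names are invented for this example.

```javascript
// Simplified, hypothetical user-agent strings, for illustration only.
var UA_OPERA_9 = "Opera/9.64";
var UA_OPERA_10 = "Opera/10.00";

// Faulty: assumes the major version is exactly one digit.
function majorVersionSingleDigit(ua) {
  var m = /Opera\/(\d)/.exec(ua);
  return m ? parseInt(m[1], 10) : NaN;
}

// Reads every leading digit of the major version.
function majorVersion(ua) {
  var m = /Opera\/(\d+)/.exec(ua);
  return m ? parseInt(m[1], 10) : NaN;
}

var wrong = majorVersionSingleDigit(UA_OPERA_10); // 1: "10" is read as "1"
var right = majorVersion(UA_OPERA_10);            // 10
```

A script using the first function would conclude that the browser is version 1 and treat it as unsupported.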

Opera has mitigated that problem by changing the user-agent to 9.80 and publishing the following warning:

Browser sniffing - unless you're writing a web stats application - is always a bad idea. It's a misguided attempt to send different content to different user agents. This is never scalable - you can't change every website you've ever made every time a new browser version comes out. It is also not future-proof, as highlighted by this article.

Ineffectual and meaningless little blurb there. Those badly written sites that used (poor) browser detection will not break from Opera. Opera spoofing their own user-agent string helps reaffirm the misconception that the authors' browser detection worked. Posting up a little warning that not everyone will read does not make an example.

The blurb states that browser detection is used "to send different content to different user agents". Not always true. In fact, browser detection is more often used on the client to work around a perceived incompatibility. Since Opera is wrong on that count, it makes the blurb seem even less relevant, as an author who read it might still try to justify or rationalize his approach by saying "but that's not why I used browser detection."

Browser detection scripts cause forwards-compatibility and maintenance problems. However, to not be able to parse out a number is not only not smart, it shows very poor coding skill.

Opera states that version 11 will have "11" in the user-agent string.

My opinion is somewhat in line with Doug's on this one (that Yahoo 360 URL is an awful URL).

If you are a developer, check your code. It really isn't hard to do this stuff correctly. It really isn't.

Where I disagree with Doug is "Opera has been forced to lie."

Opera developers made a decision to lie, as explained by Opera. They were not forced.

An alternative to that choice is for Opera to not cater to badly authored pages and simply let them break.

Breaking sites is bad in the short term because it renders pages unusable. However, it is good in the larger scheme of the web in the long run. By driving home a hard lesson, Opera could teach developers to not use browser detection by providing an historical lesson.

The first sensible opinion on the matter was Hallvord's post from December 2008, where he pointed out that Bank of America and Live.com failed in Opera 10. The entry describes the reason: faulty parsing of the User-Agent string, and redirecting to the "not supported" page.

You'd think that with the intense development Microsoft has been lavishing on live.com they would have found somebody capable of writing a usable browser sniffer (or ideally a person clever enough to say "wait, we don't really need one - what if we just use feature detection instead?"). Think again..

Of course, Microsoft has been advocating detection "best practices" for years, despite well reasoned arguments to stop doing that (G. Talbot, T. Zijdel).

Opera should be less myopic and stop worrying about breaking badly authored sites. Web developers should be less myopic, and build maintainable, forwards-compatible solutions.
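As a rough sketch of the feature detection alternative mentioned above: test for the capability on the object at hand instead of guessing from the browser's name. The helper name addListener and its return values are assumptions made for this example.

```javascript
// addListener is a hypothetical helper name, not taken from any library.
// It tests the object for the method it needs and never inspects
// navigator.userAgent.
function addListener(target, type, handler) {
  if (target.addEventListener) {
    target.addEventListener(type, handler, false);
    return "addEventListener";
  }
  if (target.attachEvent) {
    target.attachEvent("on" + type, handler);
    return "attachEvent";
  }
  // Neither interface is available; the caller decides what to do.
  return null;
}
```

A new browser version does not break this code, because nothing in it depends on a version number.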

Posted by default at 12:06 AM in Browsers

Function.prototype.bind Thursday, 11 September 2008

Function Binding

A bind function wraps a function in a closure, storing a reference to the context argument in the containing scope.

This allows the bound function to run with a predetermined context.

Variable this

When a function is passed as a reference, it loses its base object. When the unbound function is called, the this value is the global object.
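A small sketch of that behavior, with made-up names. The check compares this against the object directly, so it gives the same answer whether the unbound call receives the global object or, in strict code, undefined.

```javascript
var account = {
  owner: "alice",
  isCalledOnAccount: function() {
    // 'this' is decided by how the function is called.
    return this === account;
  }
};

// Called as a property of account: the base object is account.
var viaObject = account.isCalledOnAccount(); // true

// Passed around as a plain reference: the base object is lost.
var detached = account.isCalledOnAccount;
var viaReference = detached(); // false
```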

How Binding Works

By storing a reference to the desired object in a closure, the this value can be bound.

In its simplest form, bind looks like:-

Function.prototype.bind = function(context) {
  var fun = this;
  return function(){
    return fun.apply(context, arguments);
  };
};

Why Binding is Useful

Binding is often necessary when passing function references. For example:-

var updater = {
  fetch : function() {
    alert(this.time++);
  },
  time : 0
};
// setTimeout(updater.fetch, 500);
setTimeout(updater.fetch.bind(updater), 500);

The commented-out call to setTimeout would result in a call to updater.fetch with the global object for the this argument. this.time would be undefined, and this.time++ would result in NaN.

A bind function that does only binding accomplishes a trivial task. In most cases, a closure can just be used where binding is needed.

Binding in the Wild

Most JavaScript libraries handle binding internally. These libraries also include a partial apply for their bind function.

Partial Apply

Partial application is setting parameter values of a function call before it is called. A partial apply function usually looks like:-

/**
 * Return a function that calls this function with the
 * arguments given to partial, followed by any arguments
 * passed to the returned function.
 */
Function.prototype.partial = function() {
  var fun = this,
      preArgs = Array.prototype.slice.call(arguments);
  return function() {
    // Return the result so the caller sees the original function's value.
    return fun.apply(null, preArgs.concat.apply(preArgs, arguments));
  };
};

This allows us to program in a dynamic, functional, less OO way. For example:

function setStyle(style, prop, value) {
  return style[prop] = value;
}

// Create a setBgColor function from 
// partial application of setStyle.
var setBgColor = setStyle.partial(document.body.style, "background");

// Change the body's background color.
setBgColor("red");
setBgColor("#0f0");

Disadvantages

Partial application requires an extra function call, plus a call to concat for the extra arguments.

Partial application can make debugging trickier, since there is an extra layer of indirection to the real method.

Bind + partial apply can be used to force the this argument of a prototype method to always be the instance. This is inefficient and often leads to messy, tangled function decomposition. Libraries that bind every method do so out of ignorance of the language, and are best avoided.
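To make the cost concrete, here is a sketch of the pattern. The bindTo helper and the Widget constructor are hypothetical stand-ins, not code from any of the libraries discussed.

```javascript
// bindTo is a hypothetical stand-in for a library's bind function.
function bindTo(fn, context) {
  return function() {
    return fn.apply(context, arguments);
  };
}

function Widget(name) {
  this.name = name;
  // Binding in the constructor creates a new closure for each instance,
  // shadowing the single function shared on the prototype.
  this.getName = bindTo(this.getName, this);
}

Widget.prototype.getName = function() {
  return this.name;
};

var a = new Widget("a");
var b = new Widget("b");

var detachedGetName = a.getName;
var stillBound = detachedGetName();           // "a"
var sharesMethod = a.getName === b.getName;   // false
var hasOwnCopy = a.hasOwnProperty("getName"); // true
```

Every instance now carries its own function object per bound method, and every call goes through an extra apply.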

EcmaScript New Language Feature

The forthcoming version of EcmaScript (now called EcmaScript Harmony) will include Function.prototype.bind(context). A native bind should outperform any other bind function.

This was something Peter brought up for EcmaScript 4, and it appears to be making its way into the revised EcmaScript Harmony.

Mark Miller wrote out a "self-hosted" version of EcmaScript's proposed Function.prototype.bind. It is:-

Function.prototype.bind = function(self, var_args) {
   var thisFunc = this;
   var leftArgs = Array.slice(arguments, 1);
  return function(var_args) {
    var args = leftArgs.concat(Array.slice(arguments, 0));
    return thisFunc.apply(self, args);
  };
};  

After looking at the ES Harmony proposal and at a few versions of bind functions, I decided to write a better one that does exactly what ES Harmony's bind does, but with greater efficiency than the current libraries offer, and whose length property is 1.

Although unnecessary, this is a welcome addition to the language. A native bind will outperform any user-defined bind function and will result in fewer closures.
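On the length property mentioned above: a function's length reports the number of formal parameters it declares. The two empty functions below are placeholders that only mirror signatures, one with two named parameters like the self-hosted version quoted above, and one with a single named parameter.

```javascript
// Placeholder with two named parameters, like function(self, var_args).
function bindTwoParams(self, var_args) {}

// Placeholder with one named parameter, like function(context).
function bindOneParam(context) {}

var lengthTwo = bindTwoParams.length; // 2
var lengthOne = bindOneParam.length;  // 1

// Where a native bind exists, its length can be inspected the same way.
var nativeLength = typeof Function.prototype.bind === "function" ?
  Function.prototype.bind.length : undefined;
```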

The Rundown

Before I give a critique and rundown, I have a test.

Library Comparison Test

  1. Garrett's Bind

    This bind was, by far, the most efficient in tests #1 and #4, and nearly ties Base2 in test #2 (Base2 was about 0.5 ms faster). This function requires no additional code or functions.

  2. Base2 bind

    Second performance-wise.

    Requires only a top level _slice function (trivial), and performs a strategy for extra arguments.

  3. Dojo's hitch

    Dojo was fast with pure bind, but slower with partial apply.

    This function requires many other functions and has an additional complication of accepting strings and arguments of different order.

  4. Ext-js Function.prototype.createDelegate

    Performance was slow. This function requires no importing of external functions.

  5. Mark Miller's bind

    Requires no external dependencies. While it gets a 10 for simplicity and aesthetics, this function was not as fast, and for pure bind (no partial apply) was not nearly as fast as it should be.

  6. Prototype's Function.prototype.bind

    Performance was fair. Requires several extra functions + browser detection. The function is used very heavily internally.

  7. Mootools Function.prototype.bind

    Performance times were poor and the results for two tests were wrong.

    The entire mootools.js is required for the bind function. The library adds a $family property, and makes other changes to Array.prototype.

  8. YUI 3 bind

    Performance time was fair. The results for test #2 are wrong because the call's arguments are prepended, not appended. This is by design.

    Having the call's arguments prepended is an unusual design decision. It might seem unintuitive, and confuse developers who are used to the more common version, as seen in Prototype, Base2, and the official EcmaScript proposal.

    The amount of code YUI's bind depends on is staggering.
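The difference in argument order noted for YUI 3 can be sketched with two small helpers. Both are hypothetical and written only for this comparison; neither is any library's actual code.

```javascript
// Common order: preset arguments first, the call's arguments appended.
function bindCallArgsAppended(fn, context) {
  var preset = Array.prototype.slice.call(arguments, 2);
  return function() {
    var callArgs = Array.prototype.slice.call(arguments);
    return fn.apply(context, preset.concat(callArgs));
  };
}

// Unusual order: the call's arguments first, preset arguments after.
function bindCallArgsPrepended(fn, context) {
  var preset = Array.prototype.slice.call(arguments, 2);
  return function() {
    var callArgs = Array.prototype.slice.call(arguments);
    return fn.apply(context, callArgs.concat(preset));
  };
}

// Joins whatever arguments it receives, to make the order visible.
function listArgs() {
  return Array.prototype.slice.call(arguments).join(",");
}

var appended = bindCallArgsAppended(listArgs, null, "a", "b")("c");   // "a,b,c"
var prepended = bindCallArgsPrepended(listArgs, null, "a", "b")("c"); // "c,a,b"
```

A function written against one order receives its parameters shuffled under the other, which is why a test that expects appended arguments reports a wrong result.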

Pass the Parmeźan

Many of the libraries have long chains of function calls. A bind function does not need and should not require the inclusion of several other functions.

Prototype JS requires the $A function, which requires Prototype.Browser to determine which $A function to use. Browser detection has absolutely no place in function binding. ($A also calls toArray conditionally, but that will not happen in this case.)

Dojo is almost as bad. Dojo's hitch function has the arguments in reverse order and requires dojo.global, dojo._hitchArgs, dojo._toArray, and dojo.isString.

Mootools' code is strewn all over the place. The broken bind function required an additional 108 lines of Mootools.

YUI is the most grandiose. Function YUI.bind(f, o) is found in "oop.js". File oop.js requires over 120k of "prerequisite" files in yui.js and yui-base.js, coming to a total of over 160k. Just for bind. YUI 3 seems to suffer from over-engineering and BUFD, which is typical in waterfall shops.

YUI's 'array' module did not seem to load or evaluate properly, so code from the yui-base file was copy-pasted.

Worth Using?

The best bind functions are fast, do not require other library functions, and are fairly simple.
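For anyone who wants to check the "fast" criterion themselves, a timing loop along these lines is one rough way to do it. This is not the comparison test used for this entry; timeCalls and the two wrappers are made up for the sketch, and the numbers will vary by engine and machine.

```javascript
// timeCalls is a hypothetical helper: run fn repeatedly, return elapsed ms.
function timeCalls(fn, iterations) {
  var start = new Date().getTime();
  for (var i = 0; i < iterations; i++) {
    fn();
  }
  return new Date().getTime() - start;
}

var target = { n: 0 };
function increment() {
  this.n++;
}

// Two ways a bound function might forward a call with no arguments.
var viaApply = function() { return increment.apply(target, arguments); };
var viaCall = function() { return increment.call(target); };

var applyMs = timeCalls(viaApply, 100000);
var callMs = timeCalls(viaCall, 100000);
```

Run each candidate several times and in more than one browser before drawing any conclusion from the figures.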

But is a Bind Function Necessary?

No. Binding can be achieved with an inline closure where it is needed and partial application is not necessary.

Here is example 1, without using bind.

var updater = {
  fetch : function() {
    alert(this.time++);
  },
  time : 0
};
setTimeout(function() { updater.fetch(); }, 500);

A native Function.prototype.bind will allow for cleaner binding, without the need for creating a closure. As native code, it will be faster and more reliable. Function.prototype.bind is not necessary, but is a welcome addition to the language.

Why All the Fuss?

Being aware of what libraries do, and identifying and learning from their mistakes, helps developers avoid making the same mistakes. Developers do not need the burden of large, tangled, and often buggy library dependencies.

Look at The Code

Examining what the library's bind function does, and how the library uses that function internally, is one step to take in assessing the library's quality.

A developer can make a more responsible and professional choice by avoiding a library that makes heavy use of a slow-spaghetti bind function.

My Version

The following function is the fastest bind function for pure bind (no partial apply). It is more than three times as fast as Base2, the second fastest bind function tested here.

Here is my version of the bind function.

/**
 * @param {Object} context the 'this' value to be used.
 * @param {arguments} [1..n] optional arguments that are
 * prepended to returned function's call.
 * @return {Function} a function that applies the original
 * function with 'context' as the thisArg.
 */
Function.prototype.bind = function(context){
  var fn = this, 
      ap, concat, args,
      isPartial = arguments.length > 1;
  // Strategy 1: just bind, not a partialApply
  if(!isPartial) {
    return function() {
      if(arguments.length !== 0) {
        return fn.apply(context, arguments);
      } else {
        return fn.call(context); // faster in Firefox.
      }
    };
  } else {
    // Strategy 2: partialApply
    ap = Array.prototype;
    args = ap.slice.call(arguments, 1);
    concat = ap.concat;
    return function() {
      return fn.apply(context,
        arguments.length === 0 ? args :
        concat.apply(args, arguments));
    };
  }
};
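A short usage sketch covering both strategies. It assumes a Function.prototype.bind with the semantics described here is available, either the function above or a native one. The object and function names are made up for the example.

```javascript
var updater = {
  time: 0,
  fetch: function() {
    return this.time++;
  }
};

function describe(greeting, name) {
  return greeting + ", " + name + " (" + this.role + ")";
}

// Strategy 1: pure bind, no preset arguments.
var boundFetch = updater.fetch.bind(updater);
var first = boundFetch();  // 0
var second = boundFetch(); // 1

// Strategy 2: bind plus partial application of the first argument.
var greet = describe.bind({ role: "editor" }, "Hello");
var message = greet("Ann"); // "Hello, Ann (editor)"
```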

This function was not included in APE because it was not needed. This function may be used by libraries that wish to continue using a bind function with the benefit of faster performance.

Posted by default at 5:20 PM in JavaScript

 
