SBJson 5
Chunk-based JSON parsing and generation in Objective-C.
Overview
SBJson’s number one feature is stream/chunk-based operation. Feed the parser one or
more chunks of UTF-8-encoded data and it will call a block you provide with each
root-level document or array or, optionally, with each top-level entry in each
root-level array.
With this you can reduce the apparent latency for each
download/parse cycle of documents over a slow connection. You can start
parsing and return chunks of the parsed document before the full document
has downloaded. You can also parse massive documents bit by bit so you
don’t have to keep them all in memory.
SBJson maps JSON types to Objective-C types in the following way:
JSON Type    Objective-C Type
null         NSNull
string       NSString
array        NSMutableArray
object       NSMutableDictionary
true         -[NSNumber numberWithBool: YES]
false        -[NSNumber numberWithBool: NO]
number       NSNumber
Booleans roundtrip properly even though Objective-C doesn’t have a
dedicated class for boolean values.
Integers use either long long or unsigned long long if they fit,
to avoid rounding errors. For all other numbers we use the double
type, with all the potential rounding errors that entails.
“Plain” Chunk Based Parsing
First define a simple block & an error handler. (These are just minimal
examples. You should strive to do something better that makes sense in your
application!)
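A minimal sketch of what such a block and error handler might look like. The `SBJson5ValueBlock` and `SBJson5ErrorBlock` type names and the import path are assumptions on my part (check the headers of your installed version); the log messages are purely illustrative.

```objc
#import <Foundation/Foundation.h>
#import "SBJson5.h" // assumption: adjust to match how SBJson is installed

// Place these inside the method or function that creates the parser.

// Called once for each value the parser produces.
// Assumption: setting *stop = YES asks the parser to stop early.
SBJson5ValueBlock block = ^(id value, BOOL *stop) {
    BOOL isArray = [value isKindOfClass:[NSArray class]];
    // Anything that is not an array is simply labelled "Object" here.
    NSLog(@"Found: %@", isArray ? @"Array" : @"Object");
};

// Called when the parser encounters input it cannot handle.
SBJson5ErrorBlock eh = ^(NSError *error) {
    NSLog(@"Parse error: %@", error);
};
```

The examples that follow refer to these two variables as `block` and `eh`.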
Then create a parser and add data to it:
id parser = [SBJson5Parser parserWithBlock:block
                              errorHandler:eh];
id data = [@"[true," dataUsingEncoding:NSUTF8StringEncoding];
[parser parse:data]; // returns SBJson5ParserWaitingForData
// block is not called yet...
// ok, now we add another value and close the array
data = [@"false]" dataUsingEncoding:NSUTF8StringEncoding];
[parser parse:data]; // returns SBJson5ParserComplete
// the above -parse: method calls your block before returning.
Alright! Now let’s look at something slightly more interesting.
Handling multiple documents
This is useful for something like Twitter’s feed, which gives you one JSON
document per line. Here is an example of parsing many consecutive JSON
documents, where your block will be called once for each document:
id parser = [SBJson5Parser multiRootParserWithBlock:block
                                       errorHandler:eh];
// Note that this input contains multiple top-level JSON documents
id data = [@"[]{}" dataUsingEncoding:NSUTF8StringEncoding];
[parser parse:data];
// feed the same two documents again
[parser parse:data];
Unwrapping a gigantic top-level array
Often you won’t have control over the input you’re parsing, so you can’t use a
multiRootParser. But all is not lost: if you are parsing a long array you can
get the same effect by using an unwrapRootArrayParser:
id parser = [SBJson5Parser unwrapRootArrayParserWithBlock:block
                                             errorHandler:eh];
// Note that this input contains A SINGLE top-level document
id data = [@"[[],{},[],{}]" dataUsingEncoding:NSUTF8StringEncoding];
[parser parse:data];
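To tie this back to the slow-connection scenario from the overview, here is a rough sketch of feeding an unwrapping parser straight from the network. The `ChunkedDownload` class is hypothetical and the import path is an assumption; the delegate method is the standard `NSURLSessionDataDelegate` callback.

```objc
#import <Foundation/Foundation.h>
#import "SBJson5.h" // assumption: adjust to match how SBJson is installed

// Hypothetical helper class, not part of SBJson.
@interface ChunkedDownload : NSObject <NSURLSessionDataDelegate>
@property (nonatomic, strong) SBJson5Parser *parser;
@end

@implementation ChunkedDownload

- (instancetype)init {
    if ((self = [super init])) {
        // The block runs once per entry of the root-level array.
        _parser = [SBJson5Parser unwrapRootArrayParserWithBlock:^(id item, BOOL *stop) {
            NSLog(@"Received item: %@", item);
        } errorHandler:^(NSError *error) {
            NSLog(@"Parse error: %@", error);
        }];
    }
    return self;
}

// Called by NSURLSession each time a chunk of the response body arrives.
- (void)URLSession:(NSURLSession *)session
          dataTask:(NSURLSessionDataTask *)dataTask
    didReceiveData:(NSData *)data {
    // Hand the chunk to the parser. If it no longer wants more data
    // (it is done, or it failed), there is no point downloading the rest.
    if ([self.parser parse:data] != SBJson5ParserWaitingForData) {
        [dataTask cancel];
    }
}

@end
```

You would pass an instance of this class as the delegate when creating your `NSURLSession`, then start a data task as usual.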
Other features
For safety there is a max nesting level for all input. This defaults to 32,
but is configurable.
The writer can sort dictionary keys so output is consistent across writes.
The writer can create human-readable output, with newlines and indents.
You can install SBJson v3, v4 and v5 side-by-side in the same application.
(This is possible because all classes & public symbols contain the major
version number.)
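The writer options above might be used roughly like this. This is a sketch from memory, assuming `SBJson5Writer` offers a `writerWithMaxDepth:humanReadable:sortKeys:` factory method, a `stringWithObject:` method and an `error` property; please verify against the headers of your installed version. The `PrettySortedJSON` helper is hypothetical.

```objc
#import <Foundation/Foundation.h>
#import "SBJson5.h" // assumption: adjust to match how SBJson is installed

// Hypothetical helper: serialise an object with sorted keys and indentation.
static NSString *PrettySortedJSON(id object) {
    // Assumed factory method; 32 mirrors the default nesting limit
    // mentioned above.
    SBJson5Writer *writer = [SBJson5Writer writerWithMaxDepth:32
                                                humanReadable:YES
                                                     sortKeys:YES];
    NSString *json = [writer stringWithObject:object];
    if (json == nil) {
        // Assumed property describing why writing failed.
        NSLog(@"Could not write JSON: %@", writer.error);
    }
    return json;
}
```

Sorting keys is handy when you diff output between runs or store it in version control.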
A word of warning
Stream-based parsing does mean that you lose some of the correctness
verification you would have with a parser that considered the entire input
before returning an answer. It is technically possible to have some parts of a
document returned as if they were correct but then encounter an error in a
later part of the document. You should keep this in mind when considering
whether it would suit your application.
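If your application cannot tolerate acting on a document that later turns out to be broken, one option is to hold on to the values until the parser reports completion. The sketch below assumes the status values shown in the comments of the earlier examples; `ParseAllOrNothing` is a hypothetical helper, not part of SBJson.

```objc
#import <Foundation/Foundation.h>
#import "SBJson5.h" // assumption: adjust to match how SBJson is installed

// Hypothetical helper: returns the entries of a root-level array,
// or nil unless the whole input parsed cleanly.
static NSArray *ParseAllOrNothing(NSData *data) {
    NSMutableArray *pending = [NSMutableArray array];
    __block BOOL failed = NO;

    SBJson5Parser *parser =
        [SBJson5Parser unwrapRootArrayParserWithBlock:^(id item, BOOL *stop) {
            // Buffer each entry instead of acting on it immediately.
            [pending addObject:item];
        } errorHandler:^(NSError *error) {
            failed = YES;
        }];

    // Only hand the buffered entries back if the document completed.
    if ([parser parse:data] != SBJson5ParserComplete || failed) {
        return nil;
    }
    return [pending copy];
}
```

Note that buffering like this gives up the memory and latency benefits of streaming, so it only makes sense where correctness matters more than either.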
American Fuzzy Lop
I’ve run AFL on the sbjson binary for over 24 hours, with no crashes
found. (I cannot reproduce the hangs reported when attempting to parse them
manually.)
american fuzzy lop 2.35b (sbjson)
┌─ process timing ─────────────────────────────────────┬─ overall results ─────┐
│ run time : 1 days, 0 hrs, 45 min, 26 sec │ cycles done : 2 │
│ last new path : 0 days, 0 hrs, 5 min, 24 sec │ total paths : 555 │
│ last uniq crash : none seen yet │ uniq crashes : 0 │
│ last uniq hang : 0 days, 2 hrs, 11 min, 43 sec │ uniq hangs : 19 │
├─ cycle progress ────────────────────┬─ map coverage ─┴───────────────────────┤
│ now processing : 250* (45.05%) │ map density : 0.70% / 1.77% │
│ paths timed out : 0 (0.00%) │ count coverage : 3.40 bits/tuple │
├─ stage progress ────────────────────┼─ findings in depth ────────────────────┤
│ now trying : auto extras (over) │ favored paths : 99 (17.84%) │
│ stage execs : 603/35.6k (1.70%) │ new edges on : 116 (20.90%) │
│ total execs : 20.4M │ total crashes : 0 (0 unique) │
│ exec speed : 481.9/sec │ total hangs : 44 (19 unique) │
├─ fuzzing strategy yields ───────────┴───────────────┬─ path geometry ────────┤
│ bit flips : 320/900k, 58/900k, 5/899k │ levels : 8 │
│ byte flips : 0/112k, 4/112k, 3/112k │ pending : 385 │
│ arithmetics : 66/6.24M, 0/412k, 0/35 │ pend fav : 1 │
│ known ints : 5/544k, 0/3.08M, 0/4.93M │ own finds : 554 │
│ dictionary : 0/0, 0/0, 29/1.83M │ imported : n/a │
│ havoc : 64/300k, 0/0 │ stability : 100.00% │
│ trim : 45.19%/56.5k, 0.00% ├────────────────────────┘
^C────────────────────────────────────────────────────┘ [cpu: 74%]
+++ Testing aborted by user +++
[+] We're done here. Have a nice day!
I regret I’m only able to support the current major release.
Philosophy on backwards compatibility
SBJson practices Semantic Versioning, which
means we do not break the API within a major release. If something requires
a backwards-incompatible change, we release a new major version.
(Hence why a library of less than 1k lines has more major versions
than Emacs.)
I also try to support a gradual migration from one major version to the
next by allowing the last three major versions to co-exist in the
same app without conflicts. This works by putting the major
version number in all the library’s symbols and file names. So if v6
ever comes out, the SBJson5Parser class would become
SBJson6Parser, etc.
API Documentation
Please see the API Documentation for more details.
Installation
CocoaPods
The preferred way to use SBJson is by using CocoaPods. In your Podfile use:
Carthage
SBJson is compatible with Carthage. Follow the Getting Started Guide for iOS.
Bundle the source files
An alternative that I no longer recommend is to copy all the source files (the contents of the Classes folder) into your own Xcode project.
Examples
NSURLSessionDataDelegate to do chunked delivery.
Support
SBJson if you have questions about how to use the library.
License
BSD. See LICENSE for details.