bson2json: just a quick & dirty BSON→JSON converter, with rudimentary schema analysis, for preparing to migrate a MongoDB dump to LevelDB.


  1. Dump files are just concatenated serialized documents. The BSON format is amenable to efficient streaming (each document carries its own length prefix), so we just iterate over the “records” by piping a standard file stream into a BSON parser and stringifying each one to JSON:
    fs=require 'fs'
    BSONStream=require 'bson-stream'
    fs.createReadStream process.argv[2]
    .pipe new BSONStream
    .on 'data',(o)->
    	console.log JSON.stringify o
  2. (console.log? WTF? Crude, but it writes each record straight to stdout, which is all we need here.)
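The streamability claim above rests on BSON’s framing: every document begins with a 4-byte little-endian length covering the whole document, terminator included. A minimal plain-Node sketch of how a parser can carve records out of a byte buffer without any delimiter (`splitBSON` is our own illustrative helper, not part of any library):

```javascript
// Split a buffer of concatenated BSON documents using the 4-byte
// little-endian length prefix each document starts with.
function splitBSON(buf) {
  const docs = [];
  let off = 0;
  while (off + 4 <= buf.length) {
    const len = buf.readInt32LE(off);               // total length, prefix included
    if (len < 5 || off + len > buf.length) break;   // incomplete trailing record
    docs.push(buf.subarray(off, off + len));
    off += len;
  }
  return docs;
}

// Hand-built BSON for {a: 1}: int32 length, int32 element "a", 0x00 terminator.
const doc = Buffer.from([0x0c, 0, 0, 0, 0x10, 0x61, 0, 1, 0, 0, 0, 0]);
console.log(splitBSON(Buffer.concat([doc, doc])).length); // → 2
```

This is essentially what bson-stream does internally, buffering until a full record is available before emitting 'data'.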


  1. Collect (and count) variations in objects’ “schema”, to gain insight into the variety of “documents” stored in that particular file.

    Mapping of signatures to counts, where a signature is the document’s sorted key list stringified to JSON:

    schema={}
    # …same pipeline as above, with these handlers:
    .on 'data',(o)->
    	ks=(Object.keys o).sort()
    	s=JSON.stringify ks
    	schema[s] or=0
    	schema[s]++
    .on 'end',->
    	console.log 'Schema variations:',(Object.keys schema).length,schema
  2. Also show a union of all schema variations:
    keys={}
    # …same pipeline as above, with these handlers:
    .on 'data',(o)->
    	for k in Object.keys o
    		keys[k]=true
    .on 'end',->
    	console.log 'Keys (union):',Object.keys keys
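For clarity, the same signature-counting and key-union logic over a plain in-memory array, in plain JavaScript (`countSchemas` is a name of our own; the real script does this incrementally inside the stream’s 'data' handler):

```javascript
// A "signature" is the JSON encoding of a document's sorted key list;
// documents with the same keys in any order share a signature.
function countSchemas(docs) {
  const schema = {}; // signature -> occurrence count
  const keys = {};   // union of all keys seen
  for (const o of docs) {
    const ks = Object.keys(o).sort();
    const s = JSON.stringify(ks);
    schema[s] = (schema[s] || 0) + 1;
    for (const k of ks) keys[k] = true;
  }
  return { schema, union: Object.keys(keys) };
}

const { schema, union } = countSchemas([
  { _id: 1, name: 'a' },
  { name: 'b', _id: 2 }, // same signature: key order doesn't matter
  { _id: 3, tags: [] },
]);
console.log(Object.keys(schema).length); // → 2 schema variations
console.log(union);                      // → [ '_id', 'name', 'tags' ]
```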

EOF and callbacks?

  1. Node guarantees the process won’t exit before all queued callbacks have executed: as long as the event loop has pending work (like our stream’s 'data' and 'end' handlers), the process stays alive.
  2. But…?
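A two-line demonstration of point 1: the script body finishes immediately, but Node keeps the process alive until the pending timer callback has run:

```javascript
// Node won't exit while work is queued on the event loop: the script's
// last statement runs long before the timeout, yet the process waits.
let ran = false;
setTimeout(() => {
  ran = true;
  console.log('callback ran; now the process may exit');
}, 50);
console.log('end of script body, ran =', ran); // → end of script body, ran = false
```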

Multiple files

  1. Out of laziness — quick & dirty, like we said — just looped over all the files with a shell script:
    for f in *.bson
    do ./bson2json "$f" > "${f%.bson}.json"
    done
  2. So files are processed sequentially, synchronously, in command-line order (left to right).
  3. In a more invested migration script we’ll iterate over them recursively. Later.
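The recursive iteration hinted at in point 3 might look like this sketch: a hypothetical `processAll` that recurses from each file’s completion callback, preserving the sequential, in-order behaviour of the shell loop (`processFile` is a stand-in for a callback-style per-file converter, not an existing API):

```javascript
// Process files one at a time by recursing from the completion
// callback — the async equivalent of the sequential shell loop above.
function processAll(files, processFile, done) {
  if (files.length === 0) return done(null); // all files handled
  processFile(files[0], (err) => {
    if (err) return done(err);               // stop on first failure
    processAll(files.slice(1), processFile, done);
  });
}

// Usage with a dummy converter that just logs:
processAll(['a.bson', 'b.bson'],
  (f, cb) => { console.log('converting', f); setImmediate(cb); },
  (err) => { if (!err) console.log('all done'); });
```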
