What is Swift?
Next-gen DNA sequencers produce a lot of data. The primary data consists of sets of images, generally pictures of fluorescently tagged fragments of DNA, attached to beads or grown in clusters. Swift is an open source (LGPL3) package for processing that image data and extracting base calls. I (Nava Whiteford) started writing Swift while at the Wellcome Trust Sanger Institute, under the direction of Clive Brown and Tony Cox and in collaboration with Tom Skelly. We are also grateful to Matt Ritchie and Christina Curtis at the CRI for their useful suggestions and advice.
Why is Swift?
The instrument vendors already supply software for processing this data, of course. However, even when the source code is available, it is not open source. This makes community development of the software impossible. It also raises scientific concerns, as the algorithms used to process the primary data are not open to peer review. In addition, Swift has a number of other design goals which make it attractive:
- A single binary that goes from images to base calls (no intermediate files required, which cuts down on I/O)
- Parallelisable down to the tile level (so you can fire off 800+ runs on a cluster)
- Maintainable, extensible C++ (change algorithms and parameters easily)
- Better algorithms
- Fast!
Where can I get it?
Rather than waiting until Swift is perfect before releasing it, we've decided to let people play with it and hopefully attract some more developers. We'd like this to become a community project, and would welcome suggestions, comments and code contributions.
Swift is hosted on SourceForge at: http://www.sourceforge.net/projects/swiftng
You can download it from the Subversion repository. On Linux, do the following:
svn co https://swiftng.svn.sourceforge.net/svnroot/swiftng/trunk
Documentation is available under the documentation directory.
A set of tile data, which you can use to try Swift out, is available here.
We also have a mailing list here.
You are welcome to mail me (Nava Whiteford) directly at: new at sgenomics dot org.