sgenomics.org

Welcome to sgenomics.org

MiaPar

The manuscript describing the work of the ProteomeBinders subgroup on data standards and exchange formats has finally been published. This paper focuses on the minimum information required for a protein affinity reagent experiment. The work was led by Sandra Orchard and Henning Hermjakob at the EBI. I've been interested in ontologies and data standards for some time, so it's great to be able to put this into practice and help out on an important and emerging data format. The work is based on the well-established PSI-MI (Molecular Interaction) data standards, which it extends to cover the exchange of information about proteomics experiments that use affinity reagents such as antibodies.

The paper is available here.

text repeat analysis

I have seen a bunch of papers by the same author recently where I felt that text was repeated between papers. To test this idea I ran the text of these papers through some compression algorithms: text shared between two documents compresses better when the documents are compressed together, so this gives a measure of whether the repetition is significant.

The results will appear here again soon.
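
In the meantime, here is a minimal sketch of the general idea in Python. It is not the analysis I actually ran: it assumes zlib as the compressor and the normalized compression distance (NCD) as the similarity measure, and the function names are made up for illustration. Two texts that share a lot of material should score closer to 0, and unrelated texts closer to 1.

```python
import zlib


def compressed_size(text: str) -> int:
    """Return the size in bytes of the zlib-compressed text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance between two texts.

    Assumption: zlib stands in for "some compression algorithms".
    Values near 0 suggest heavy overlap; values near 1 suggest little.
    """
    size_a = compressed_size(a)
    size_b = compressed_size(b)
    size_ab = compressed_size(a + b)
    return (size_ab - min(size_a, size_b)) / max(size_a, size_b)
```

Note that zlib only looks back about 32 KB, so for full-length papers it would be better to compare section by section or to use a compressor with a larger window.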

ELM Client

Quite a few people in the new lab need access to the ELM server. Thankfully, the ELM team provide a web service for connecting to their database. We have developed a Java client for this service and provided some example code to go along with it. At the moment all the code is in an Eclipse project, so we are just sharing the whole project with all of the required library jars and Axis-generated code. In the future we'll try to tidy that up a bit.

The code is available at UCD SVN Repo.

DAS2010

It's that time of year again when I visit Cambridge and learn about the changes in the Distributed Annotation System. The conference was hosted, as always, at the Genome Campus. This year there is less change in the system and more consolidation of existing systems. The talks focused on client development, and there seems to be much more emphasis on the delivery of data. Many of the talks introduced client libraries and discussed data representation. This is an encouraging direction for the system to take. The scale of some of the new DAS servers is pretty incredible. The enCore project brings together around 20 different bioinformatics groups in Europe. These massive EU framework programmes can be pretty scary. However, given the amount of money they attract, they are able to provide services that smaller projects or groups could not achieve. One such example is the easyDAS project, which offers free hosting for small projects that want to share data via a DAS server.

BookReview

Late last year I reviewed a book entitled "What a time I am having - Selected letters of Max Perutz" for the journal BioEssays. The review is out now and available on the BioEssays site. I won't bother adding much more here since it's available there. But it was fun reviewing the book and I would recommend the process (and the book) to others.

The book is available on Amazon.

Managing Open data

I gave my first talk yesterday at the UCD Bioinformatics Seminar Series. A couple of points. First, I must remember to start using some kind of slide sharing service in order to preserve my talks. I'm impressed by the ability of others to maintain a record of their public speaking events.

The other major point that emerged was a discussion of Open Data and the challenges it opens up. I've already expressed an interest in hosting DAS services or web services on cloud infrastructure, and I'll hopefully get some time to work on this in a couple of months' time. I came across this article today on the topic of Open Data and it reminded me to restart a couple of data sharing projects. Hosting these in the cloud is really attractive for researchers since it dramatically lowers the costs, and most of the time the expected usage is below the threshold at which these services start charging.

I'll be putting the SLiMFinder code up on an SVN server at some point in the near future. Hopefully that will spur some increased development of that code. There is a mailing list on Google Groups as well if people are interested.

dazzle On GAE

I've been following the work of Vincent Rouilly at the Parts Registry at MIT and trying to get the DAS server dazzle working on the Google App Engine. So far it seems to work, though I can't get the datasources working correctly. More updates to follow.

Utopia Documents

The Advanced Interfaces Group in Manchester has been developing tools for biologists for some time. The structure viewer Cinema and tools like Ambrosia have been demonstrated to be useful in the past. I thought I would draw attention to their latest offering Utopia Documents available from here:

It's pretty powerful and, for my money, should be your default reader for PDFs, replacing Preview or Acrobat.

Wave Robot

I've been playing around with Google Wave for the last month or so. I have to say I like it and am impressed by the potential. I'd like to hear from others interested in hooking up a Distributed Annotation System server to Google Wave. I've seen some stuff about hosting a DAS server on the Google App Engine. I reckon this might be an interesting way to solve the writeback problem.

I've written a small robot based on buglinky to handle creating links to the QuickGO service from the EBI. I'd like to extend this to offer searches against the Ontology Lookup Service and provide links to that. The robot is available at bioontorobot@appspot.com. There is also a GWT front-end to the service, but this doesn't really capture the potential of the service.
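
To give a flavour of what the linking step involves, here is a minimal sketch in Python. This is not the robot's actual code and it does not use the Wave API: it only shows spotting GO identifiers in a message and building a link for each. The QuickGO URL pattern and the function name are assumptions for illustration, so check the EBI site for the current address format.

```python
import re

# Assumed URL pattern for a QuickGO term page (hypothetical; verify before use).
QUICKGO_TERM_URL = "https://www.ebi.ac.uk/QuickGO/term/{term_id}"

# GO identifiers are "GO:" followed by exactly seven digits, e.g. GO:0008150.
GO_ID_PATTERN = re.compile(r"\bGO:\d{7}\b")


def find_go_links(message: str) -> list[tuple[str, str]]:
    """Return (term_id, url) pairs for each distinct GO identifier,
    in order of first appearance in the message."""
    seen: list[str] = []
    for term_id in GO_ID_PATTERN.findall(message):
        if term_id not in seen:
            seen.append(term_id)
    return [(t, QUICKGO_TERM_URL.format(term_id=t)) for t in seen]
```

A robot would run something like this over each new message and attach the resulting links to the matching text.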

Anyway - I'd be interested in collaborating on this with others interested in biological ontologies and data integration.

It seems that UCD have an office that manages open source development. We will shortly be approaching them to see if they will help develop the wave robots.

New Problems

I have recently moved lab. Whilst I will continue to push updates to the EpiC web resource, I will no longer use this blog solely for development-related information. I also intend to draw attention to interesting articles that I have come across and, to a lesser degree, give an indication of the work that I will be doing at UCD. My hope is that people find it interesting and useful, but also that it acts as a place for me to improve my communication skills.

So, the topic of the first post is a couple of articles I read recently on the problems associated with being a young post-doc looking to make a name for oneself. The first relates to the culture that has arisen, particularly in the UK, with which I am most familiar, of measuring scientific output in terms of publications, which according to some is driving people out of science and into other careers. I'm not sufficiently far along in my career to comment on this from a personal perspective; however, I do hear this complaint often from other post-docs, occasionally from young PIs and more rarely still from senior academics. It makes for a fairly depressing read: PLoS Biology.

The second article focuses on choosing appropriate problems for scientists. I think the paper makes some good points. It's the sort of article that people read and say describes what they do all the time. I think the reality is that people like to believe they do this sort of thing but don't in practice. The author has an HTML version here for those without a subscription to Molecular Cell: How to choose a good scientific problem.

Last edited Thu Sep 10 10:34:17 2009