@hackage clustertools 0.1

Tools for manipulating sequence clusters

This is a bunch of stuff I needed at some point for manipulating sequence clusters. See the README for details. The tools included are:

  • filter - remove unwanted sequences from a clustering.

  • hist - produce a histogram of cluster sizes from a "label"-formatted clustering.

  • clusc - compare clusterings, calculating numerous pair-based and entropy-based indices.

  • add_single - add singletons to a clustering.

  • ace2contigs - parse an ACE assembly file, and output the contigs in a FASTA file.

  • ace2fasta - parse an ACE assembly, and output each assembly in a separate FASTA file.

  • clusterlibs - given a table of regular expressions and library names, along with a clustering (TGICL-format), output a table of clusters with the library name prepended to the sequences.
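To give a rough idea of what hist and clusc compute, here is a minimal sketch. It is not the code from this package. It assumes a clustering is already in memory as a list of clusters, each a list of sequence names that appears in exactly one cluster, and it does not model the "label" or TGICL file formats. The names Clustering, sizeHistogram, labels and randIndex are made up for illustration. The Rand index is used as one well-known example of a pair-based index; whether clusc reports exactly this index is an assumption. The sketch uses Data.Map.Strict from the containers package.

```haskell
import           Data.List       (tails)
import qualified Data.Map.Strict as M

-- | Hypothetical in-memory clustering: each inner list is one cluster
-- of sequence names. The package's file formats are not modelled here.
type Clustering = [[String]]

-- | Histogram of cluster sizes: maps a size to the number of clusters
-- that have that size.
sizeHistogram :: Clustering -> M.Map Int Int
sizeHistogram cs = M.fromListWith (+) [ (length c, 1) | c <- cs ]

-- | Map each sequence name to the index of the cluster containing it.
-- Assumes every name occurs in only one cluster.
labels :: Clustering -> M.Map String Int
labels cs = M.fromList [ (s, i) | (i, c) <- zip [0 ..] cs, s <- c ]

-- | Rand index: the fraction of sequence pairs on which two clusterings
-- agree (both put the pair together, or both keep it apart).
-- Only sequences present in both clusterings are considered.
-- Returns Nothing when there are no pairs to compare.
randIndex :: Clustering -> Clustering -> Maybe Double
randIndex a b
  | null pairs = Nothing
  | otherwise  = Just (fromIntegral agree / fromIntegral (length pairs))
  where
    la    = labels a
    lb    = labels b
    names = M.keys (M.intersection la lb)
    -- every unordered pair of shared names, each listed once
    pairs = [ (x, y) | (x : rest) <- tails names, y <- rest ]
    same m x y = M.lookup x m == M.lookup y m
    agree = length [ () | (x, y) <- pairs, same la x y == same lb x y ]
```

Loaded into GHCi, sizeHistogram [["s1","s2"],["s3"]] counts one cluster of size 1 and one of size 2. Comparing that clustering with [["s1"],["s2","s3"]] using randIndex finds agreement on one of the three pairs. Enumerating all pairs is quadratic in the number of sequences, so this is only suitable for small inputs.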

The Darcs repository is at: http://malde.org/~ketil/cluster_tools.