Wednesday, December 16, 2009

Target Platform to have $HOME/.eclipse/$ECLIPSE/plugins ??

Bioclipse uses a custom target platform... but since I am now using the Eclipse 3.5 from Ubuntu, extra features I download (GEF, BIRT, EMF, ...) end up in $HOME... so, I need to add a folder to the target-platform.target file... but I cannot find the variable for $HOME/.eclipse/org.eclipse.platform_3_foo_bar/plugins, even though the file already uses things like ${eclipse_home}/plugins...

Right now I have the below, but that clearly is not the solution:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?pde version="3.5"?>

<target name="Spring-osgi-1.0.2">
  <locations>
    <location path="${eclipse_home}" type="Profile"/>
    <location path="${project_loc}/spring-osgi-1.0.2/dist" type="Directory"/>
    <location path="${project_loc}/libs" type="Directory"/>
    <location path="${project_loc}/spring-osgi-1.0.2/lib" type="Directory"/>
    <location path="${eclipse_home}/plugins" type="Directory"/>
    <location path="/home/egonw/.eclipse/org.eclipse.platform_3.5.0_155965261/plugins"
                 type="Directory"/>
  </locations>
</target>

What is the ${variable} I should use for user-based plugin folders?
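Until a proper variable turns up, the hard-coded path can at least be discovered rather than typed by hand. The following is only a minimal sketch, assuming the user-level plugins live in $HOME/.eclipse/org.eclipse.platform_*/plugins as in the path above; the class name UserPluginDirs is hypothetical and nothing here is part of the PDE.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Eclipse or the PDE.
class UserPluginDirs {

    // Returns every <home>/.eclipse/org.eclipse.platform_*/plugins directory.
    static List<Path> find(Path home) {
        Path base = home.resolve(".eclipse");
        List<Path> result = new ArrayList<>();
        if (!Files.isDirectory(base)) {
            return result;
        }
        // Assumption: the per-user folders are named org.eclipse.platform_<version>_<id>
        try (DirectoryStream<Path> dirs =
                 Files.newDirectoryStream(base, "org.eclipse.platform_*")) {
            for (Path dir : dirs) {
                Path plugins = dir.resolve("plugins");
                if (Files.isDirectory(plugins)) {
                    result.add(plugins);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        // Print one <location> element per folder, ready to paste into the .target file
        for (Path plugins : find(Paths.get(System.getProperty("user.home")))) {
            System.out.println(
                "<location path=\"" + plugins + "\" type=\"Directory\"/>");
        }
    }
}
```

This does not answer the question about the variable; it only avoids guessing the generated folder name.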

Saturday, November 7, 2009

Call for Collaboration: JavaDoc validation with OpenJavaDocCheck

Dear all,

While it is still very much a work in progress, I have made good headway with writing a new BSD-licensed replacement for Sun's DocCheck utility, for testing a library's JavaDoc quality. DocCheck never really satisfied me, and the most recent version is ancient. Because it is closed source, no one can continue those efforts. DocCheck is MIA.

PMD gives nice, simple overviews instead. It provides me with a quick overview of what is wrong with the CDK JavaDoc, and also provides a decent XML format which allows extraction of information, which is used by, for example, SuperNightly, as shown yesterday in PMD 2.4.5 installed in the CDK 1.2.x branch.

I have been pondering it for a long time now, but writing a JavaDoc checking library is hardly core cheminformatics research; at least, you would not get funding for it, despite everyone always complaining about the lack of good documentation. Alas.

A few weeks ago, I was reviewing some more code, and again saw the very common error of the missing period at the end of the first sentence in JavaDoc. This one is sort of important for proper JavaDoc documentation generation, but given the complexity of the current DocCheck reporting, people are not familiar enough with it. Being tired of having to repeat myself, I decided to address the problem by creating better Nightly error reporting for the CDK JavaDoc.

So, I started OpenJavaDocCheck, or ojdcheck. As mentioned, I have made quite promising progress, and the current version provides the ability to write custom tests (which I plan to use for validating the content of CDK taglets), and to create XML as well as XHTML which can be saved to any file. To give you a glimpse of where things are going, here's a screenshot of the current XHTML output:



This list shows a mix of tests that are now implemented in OpenJavaDocCheck itself, but the third line is actually a test that is plugged in and specific to the CDK. This is an important feature, I think, and allows users of OpenJavaDocCheck to add functionality that is not interesting to the general public, but very interesting for the JavaDoc being analyzed. Well, at least, it is to the CDK project :)

The current list of tests is still quite small, and consists of these tests:
  • test if each class and method has JavaDoc
  • test for missing @return tags
  • test for missing @param tags
  • test for @returns instead of @return
  • test @param template code, such as added by IDEs like Eclipse
  • test @exception template code, such as added by IDEs like Eclipse
  • test for redundant @version tags
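To make two of these checks concrete, here is a minimal sketch of the missing-period test and the @returns test. It is not the OpenJavaDocCheck API: the class JavaDocChecks and its check() method are hypothetical, and the sketch assumes the comment text has already been stripped of the /** ... */ decoration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration; OpenJavaDocCheck itself is organized differently.
class JavaDocChecks {

    // A period followed by whitespace or the end of the description
    private static final Pattern SENTENCE_END = Pattern.compile("\\.(\\s|$)");
    // The misspelled tag "@returns" as a whole word
    private static final Pattern RETURNS_TAG = Pattern.compile("@returns\\b");

    static List<String> check(String comment) {
        List<String> problems = new ArrayList<>();

        // The description is everything before the first line that starts with '@'
        String description = comment.split("(?m)^\\s*@", 2)[0].trim();
        if (!description.isEmpty() && !SENTENCE_END.matcher(description).find()) {
            problems.add("first sentence does not end with a period");
        }
        if (RETURNS_TAG.matcher(comment).find()) {
            problems.add("@returns used instead of @return");
        }
        return problems;
    }

    public static void main(String[] args) {
        String comment = "Returns the atom count\n@returns the number of atoms";
        for (String problem : check(comment)) {
            System.out.println(problem);
        }
    }
}
```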
I am now seeking feedback on the current code base, and potentially collaboration on writing more JavaDoc validation tests. There is enough to do, and I have been thinking about tests for:
  • spell checking JavaDoc
  • checking for 404s of web pages linked with <a href> in the JavaDoc
  • well-formedness of the HTML in the webpages
And about:
  • a PMD-like system to allow people to choose which testing they want or not
  • an Eclipse plugin

Wednesday, November 4, 2009

New Bioclipse Features: Kabsch Alignment, RMSD Distance and Tanimoto Similarity Matrices

We recently submitted a second paper on Bioclipse, and have worked hard in the past two weeks on addressing the reviewers' questions (and we love these feature requests! See also these two blogs). One reviewer seemed very interested in seeing docking available in Bioclipse. While we do not have a full docking feature set up for Bioclipse, we do have functionality to deal with 3D structures, though our research urged us to focus on the 2D side of cheminformatics so far.

To strengthen our intentions towards the 3D cheminformatics world, we have implemented a few new features, using CDK functionality. For example, we added Kabsch alignment and the related RMSD between molecular structures, implemented both as popup menus and as manager methods. The manager method you can see in action in MyExperiment workflow 937, which you can download directly into Bioclipse with one simple command (see Bioclipse Manager for MyExperiment.org):
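// SMILES strings of the molecules to align
var smileses = new Array("CC(C)C", "CCCN", "CCC=O");

// generate 3D coordinates for each molecule
var unaligned = cdk.createMoleculeList();
for (var i = 0; i < smileses.length; i++) {
  var mol = cdk.fromSMILES(smileses[i]);
  mol = cdk.generate3dCoordinates(mol);
  unaligned.add(mol);
}

// align the molecules with the Kabsch algorithm
var aligned = cdk.kabsch(unaligned);

// show the first molecule in Jmol and append the others
jmol.load(aligned.get(0));
for (var i = 1; i < aligned.size(); i++) {
  jmol.append(aligned.get(i));
}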
Now, we do have to update the use of Jmol in Bioclipse, and a big overhaul is scheduled for the 2.4 release in February next year. But you get the idea.

As said, there are two sides to adding this new functionality. The first is the manager method, because we want all GUI interaction the user performs to be recordable. (Scientist 1: What did you do to get those nice results? Scientist 2: I pushed that button in that long menu. Scientist 1: What button is that? Scientist 2: Wait, I'll send you the BSL script with a Google Wave.)

The managers that allow this recording are Bioclipse-specific, and also the reason why it would not be trivial to make a general Bioclipse plugin for Eclipse... some Spring magic is used to inject the managers into the JavaScript language. Anyway, the second thing is to add a GUI element, like popup menus. Now, this is a particular area where Eclipse excels. I did have to ask for the details, as I am not using this daily (I'm doing science, not IT), but Ola was kind enough to give me the pointers for it.

The below configuration snippet links the pop up action to Bioclipse Navigator content (you know, where your MDL SD, CML, script and other files show up in Bioclipse). But only if I have selected 3 or more files! And, only if those files are actually some molecular content with 3D coordinates! And Bioclipse inherits this functionality by using the Eclipse platform.
<menuContribution
  locationURI="popup:org.eclipse.ui.popup.any?after=additions">
  <command
    commandId="net.bioclipse.cdk.ui.handlers.kabschAlignment"
    label="Perform Kabsch Alignment"
    icon="icons/molecule2D.png">
    <visibleWhen>
      <with variable="selection">
        <count value="(2-"/>
        <iterate operator="and" ifEmpty="false">
          <adapt type="org.eclipse.core.resources.IResource">
            <or>
              <test property="org.eclipse.core.resources.contentTypeId"
                       value="net.bioclipse.contenttypes.cml.singleMolecule3d"/>
              <test property="org.eclipse.core.resources.contentTypeId"
                       value="net.bioclipse.contenttypes.cml.singleMolecule5d"/>
              <test property="org.eclipse.core.resources.contentTypeId"
                       value="net.bioclipse.contenttypes.mdlMolFile3D"/>
            </or>
          </adapt>
        </iterate>
      </with>
    </visibleWhen>
  </command>
</menuContribution>
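Read as plain logic, the visibleWhen expression above says: more than two items selected, and every selected item has one of the three listed content types. The sketch below restates just that condition in Java. It is an illustration only, with a hypothetical class name, and it does not use the Eclipse expression framework (which evaluates the real contentTypeId property test itself).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical restatement of the <visibleWhen> logic; not Eclipse API.
class KabschMenuVisibility {

    // The three content type ids from the <or> element
    static final Set<String> ALLOWED = new HashSet<>(Arrays.asList(
        "net.bioclipse.contenttypes.cml.singleMolecule3d",
        "net.bioclipse.contenttypes.cml.singleMolecule5d",
        "net.bioclipse.contenttypes.mdlMolFile3D"));

    // contentTypeIds holds one content type id per selected resource
    static boolean isVisible(List<String> contentTypeIds) {
        // <count value="(2-"/> : more than two selected elements
        if (contentTypeIds.size() <= 2) {
            return false;
        }
        // <iterate operator="and"> : every element must pass the <or> test
        for (String id : contentTypeIds) {
            if (!ALLOWED.contains(id)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String mol3d = "net.bioclipse.contenttypes.mdlMolFile3D";
        System.out.println(isVisible(Arrays.asList(mol3d, mol3d)));        // two files
        System.out.println(isVisible(Arrays.asList(mol3d, mol3d, mol3d))); // three files
    }
}
```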
When Bioclipse is run, this looks like:



And the alignment results will nicely show up in a Jmol viewer (while it is implemented as an Eclipse editor, it is not yet):


The first screenshot also shows the new pop-up menus for calculating two matrices for 3 or more molecules. One is based on the RMSD of the 3D coordinates of the atoms in the MCSS and will create a distance matrix. (BTW, Asad's SMSD work is making its way into the CDK library, and will be available in a later Bioclipse version too.) The second new pop-up menu uses the Tanimoto similarity measure based on CDK fingerprints of the selected chemical graphs. If the Bioclipse Statistics feature is installed, the created CSV files will open up in a matrix editor:
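For readers unfamiliar with the Tanimoto measure mentioned above: for two bit fingerprints it is the number of bits set in both, divided by the number of bits set in either. Here is a minimal sketch, assuming the fingerprints are available as java.util.BitSet; the class name is hypothetical, this is not the Bioclipse or CDK implementation, and returning 0 for two empty fingerprints is just one possible convention.

```java
import java.util.BitSet;

// Hypothetical illustration of the Tanimoto similarity on bit fingerprints.
class Tanimoto {

    // |A and B| / |A or B|; defined here as 0 when both fingerprints are empty
    static double similarity(BitSet a, BitSet b) {
        BitSet common = (BitSet) a.clone();
        common.and(b);
        BitSet either = (BitSet) a.clone();
        either.or(b);
        int union = either.cardinality();
        return union == 0 ? 0.0 : (double) common.cardinality() / union;
    }

    // Symmetric similarity matrix for a set of fingerprints
    static double[][] matrix(BitSet[] fingerprints) {
        int n = fingerprints.length;
        double[][] result = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                double s = similarity(fingerprints[i], fingerprints[j]);
                result[i][j] = s;
                result[j][i] = s;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet a = new BitSet();
        a.set(0); a.set(1); a.set(2);
        BitSet b = new BitSet();
        b.set(1); b.set(2); b.set(3);
        // Two shared bits out of four distinct bits
        System.out.println(similarity(a, b));
    }
}
```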


Wednesday, September 23, 2009

Extension point for running JUnit tests in a RCP Application instance?

One thing that has been on my wishlist is to be able to run the unit tests we have for Bioclipse from inside a running Bioclipse instance. That is, we would have Bioclipse Test Suite features on the update site, matching the functional features we have there. Each such test suite would run all JUnit tests we have for that feature.

The good thing about this is twofold:

  1. users can verify that their installation is working as intended
  2. the development team can easily run the test suite on foreign systems, without the need to install a fully operational Eclipse with Bioclipse development workspace
Now, the tricky thing is likely the following. How do we get to run all test suites? That is, I don't want to have to run the suites for each feature separately. Of course, this is exactly what extension points are for.

So, my question is, did anyone set up a system like this? And, is there an extension point that allows features to plug additional JUnit test suites into a larger test suite dynamically?

Monday, September 7, 2009

New Plug-in Wizard template: Can I add Import-Package programmatically?

Dear Planet Eclipse readers, please take notice of my problem with adding an Import-Package to the MANIFEST.MF using the Plug-in Wizard templating mechanism. Any suggestions and pointers very much appreciated! I'd really like to remove step 7 from the following tutorial:

Last Friday, the Bioclipse 2.1 development series moved to Eclipse 3.5, so I had to update the Bioclipse SDK too, which we developed earlier.

With a new Eclipse version also come new screenshots to talk you through the process of setting up a new Bioclipse manager plugin.

Step 1
Right click in your workspace navigator, and choose New -> Project:



Step 2
And select to create a new Plug-in Project:



Step 3
Give a project name, such as net.bioclipse.xml:



Step 4
Tune the ID, Version, Name, and Provider to your liking:



Step 5
Then select Bioclipse Manager:



Step 6
The next wizard page is specific to the Bioclipse manager, and asks for a manager namespace, which will be used as prefix in the JavaScript Console. For example, if I make the namespace xml, then I will type xml.someMethod() inside the JavaScript. The default manager name is typically OK:

Then click Finish and let Eclipse set up the new project.

Step 7
Because I have not figured out yet how to add Import-Package to the MANIFEST.MF programmatically, you will have to do this manually. Add the last line of the next screenshot to the MANIFEST.MF of your new plugin:



Update: I found a hack to add the Import-Package programmatically, by overriding the execute(IProject project, IPluginModelBase model, IProgressMonitor monitor) method in the Template class.
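Outside of the wizard templating mechanism, the header itself is easy to add with the standard java.util.jar.Manifest class. The sketch below shows only that part and is not the hack mentioned above; the class name ManifestImports and the package name net.bioclipse.example are hypothetical placeholders, and the comma splitting is naive (it does not handle version ranges such as version="[1,2)").

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper; it does not use the PDE template API.
class ManifestImports {

    // Adds pkg to the Import-Package header unless it is already listed
    static void addImportPackage(Manifest manifest, String pkg) {
        Attributes main = manifest.getMainAttributes();
        String existing = main.getValue("Import-Package");
        if (existing == null || existing.trim().isEmpty()) {
            main.putValue("Import-Package", pkg);
        } else if (!Arrays.asList(existing.trim().split("\\s*,\\s*")).contains(pkg)) {
            // Naive: assumes plain package names without attributes
            main.putValue("Import-Package", existing.trim() + "," + pkg);
        }
    }

    public static void main(String[] args) throws IOException {
        Manifest manifest = new Manifest();
        // Manifest.write() only emits the main section when Manifest-Version is set
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        addImportPackage(manifest, "net.bioclipse.example"); // placeholder package
        manifest.write(System.out);
    }
}
```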

Saturday, August 29, 2009

Reminder: my talk in Frankfurt on Monday; Want to meet up?

Quick and short reminder about my Open Knowledge: Reproducibility in Cheminformatics with Open Data, Open Source and Open Standards talk on Monday. The session is great anyway, with other talks from Cameron, John and someone from Berlin on an Open Access HTS system (which reminds me to talk about Open Access, and that the term is tainted).

I still have a free program, other than that I want to see Google Wave in action (and while I have received my invitation, I have not received a login account yet). There is a potentially interesting talk about Second Generation Small Molecule Therapeutics at 15:00. But no plans otherwise for the afternoon and/or evening.

If you would like to talk about the CDK, Bioclipse and/or the Blue Obelisk movement, or about my talk on Open Data, Open Standards and Open Source (ODOSOS) in chemoinformatics, and you happen to be around the Frankfurt Westend campus (in building 4, I think, the Hörsaalzentrum, where the conference is), please let me know if you would like to meet up. I hope to be online :), but no promises on that... it should work at a Uni location, no? Let's see... This is how to ping me, and don't worry about redundancy.

Email: egon.willighagen at gmail dot com
IRC: #cdk at irc.freenode.net
Twitter: egonwillighagen
Identica: chemblaics
Blog: just leave a reply to this message

Monday, August 17, 2009

Bioclipse and SPARQL end points

Last week, there was a very interesting thread on the DBPedia mailing list, on using Java for doing remote SPARQL queries. This was one of the features still missing in bioclipse.rdf. Richard Cyganiak replied, pointing to the code in Jena which conveniently does this and which bioclipse.rdf is already using anyway. Next, Fred Durao even gave a full code example, relieving me from any further research, resulting in sparqlRemote() now being implemented in the rdf manager:
> rdf.sparqlRemote(
"http://dbpedia.org/sparql",
"select distinct ?Concept where{[] a ?Concept } LIMIT 10"
);
[[http://dbpedia.org/ontology/Place], [http://dbpedia.org/ontology/Area],
[http://dbpedia.org/ontology/City], [http://dbpedia.org/ontology/River],
[http://dbpedia.org/ontology/Road], [http://dbpedia.org/ontology/Lake],
[http://dbpedia.org/ontology/LunarCrater],
[http://dbpedia.org/ontology/ShoppingMall], [http://dbpedia.org/ontology/Park],
[http://dbpedia.org/ontology/SiteOfSpecialScientificInterest]]
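Under the hood, a remote SPARQL query is just an HTTP request with the query passed in a query parameter, as defined by the SPARQL protocol. The sketch below only builds such a request URL with the standard library; it is not the Jena-based code used in bioclipse.rdf, the class name SparqlUrl is hypothetical, and it assumes the endpoint URL has no query string of its own.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical illustration of the SPARQL protocol's GET form; not the Jena API.
class SparqlUrl {

    // Appends the URL-encoded query as the "query" parameter of the endpoint
    static String queryUrl(String endpoint, String sparql) {
        try {
            return endpoint + "?query=" + URLEncoder.encode(sparql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(queryUrl(
            "http://dbpedia.org/sparql",
            "select distinct ?Concept where{[] a ?Concept } LIMIT 10"));
    }
}
```

Sending the request and parsing the results is exactly the part that Jena conveniently takes care of.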
I reported earlier two example SPARQL queries for chemistry, which can now be rewritten as Bioclipse scripts:

and

Thursday, August 13, 2009

Making Bioclipse Development easier: the New Manager Wizard

Today, Jonathan, Carl, Arvid and I made writing managers for Bioclipse a bit easier. Plug-in development in Eclipse is in itself already tricky to learn, and the use of Spring by the Bioclipse managers is not helping. And because two new people will be starting to write a new manager rather soon, we thought it was time to lower the activation barrier a bit.

The basic file structure of a Bioclipse manager looks like:
net.bioclipse.foo/
|-- META-INF
|   |-- MANIFEST.MF
|   `-- spring
|       `-- context.xml
|-- plugin.xml
|-- .classpath
|-- .project
|-- build.properties
`-- src
    `-- net
        `-- bioclipse
            `-- foo
                |-- Activator.java
                `-- business
                    |-- FooManager.java
                    |-- FooManagerFactory.java
                    |-- IFooManager.java
                    |-- IJavaFooManager.java
                    `-- IJavaScriptFooManager.java
That is twelve files which need to be just right. I used to copy/paste from an earlier (simple) manager.

But we know and understand that setting up this framework is even more challenging if you have not done this at least 10 times before. So, today we implemented a New Wizard (source available from this Git repository: bioclipse.sdk).

It just asks you a project name:

and a few other settings:



Installing the Bioclipse SDK
Installing this new plugin is fairly easy, and we have set up an Update Site at http://pele.farmbio.uu.se/sdk/. Just add this as an Update Site in Eclipse 3.4.x (which is still required for Bioclipse2). It depends on the JDT and PDE, which you will likely already have installed, as they are part of the default Eclipse RCP release.

Go to the Software Updates in the Help menu:

and pick Add Site.... Enter the aforementioned update site as shown here:

Then, select the Bioclipse plugin:

After you hit Install and Eclipse installs the few tens of kB of the plugin, the plugin should show up in your installation, like it did in mine:



Implementation Details

Writing the plugin was a challenge to me, and I am happy we were doing this in a hackathon. The Bioclipse-QSAR project already had a New Project wizard, but not for a new Plug-in Project. Some things are just slightly different then. For example, it turned out that creating a .classpath cannot be done in the regular way (it never showed up), and I had to dig up some internal code of the PDE. Actually, our current implementation is still using a few internal classes because of this:
// ClasspathComputer and ExecutionEnvironmentAnalyzer are internal PDE classes
IClasspathEntry[] entries = new IClasspathEntry[3];
String executionEnvironment = null;
ClasspathComputer.setComplianceOptions(
    project,
    ExecutionEnvironmentAnalyzer.getCompliance(executionEnvironment)
);
// three entries: the JRE, the PDE container, and the src/ folder
entries[0] = ClasspathComputer.createJREEntry(executionEnvironment);
entries[1] = ClasspathComputer.createContainerEntry();
IPath path = project.getProject().getFullPath().append("src/");
entries[2] = JavaCore.newSourceEntry(path);
Ideas are most welcome on how to clean up this code, and not make it use internal, non-exported classes. For the Java source files and even the MANIFEST.MF we are using templates, though I have seen this file being created programmatically too.

I'm sure we'll run into some needed plumbing here and there, but that's what update sites are for, no? Release soon, release often is an Open Source concept that works well in the Eclipse world.

Friday, August 7, 2009

Searching PubChem from within Bioclipse

For the application note which we are about to submit, I was working on improving the PubChem Bioclipse API a bit, resulting in new download methods:



The search allows using PubChem Filters, which provide many simple means to restrict the search results. For example, we can search molecules and restrict on the molecular weight:
lists = pubchem.download(pubchem.search("malaria 300:500[MW]"))
Other filters you can use in pubchem.search (provided by PubChem itself) include (with examples):
  • [el]: pubchem.search("Au[el]")
  • [inchi]: pubchem.search("\"InChI=1S/CH4/h1H4\"[inchi]")
  • [inchikey]: pubchem.search("VNWKTOKETHGBQD-UHFFFAOYSA-N[inchikey]")
  • [mimass]: pubchem.search("375.9785:375.9786[mimass]")
And many, many more... see the linked Filters page.

Now, you surely want to look at the hits, for which we use the molecular table editor:
list = pubchem.download(pubchem.search("375.9785:375.9786[mimass]"))
cdk.saveSDFile("/Virtual/hits.sdf", list)
ui.open("/Virtual/hits.sdf")
Resulting in:

Wednesday, August 5, 2009

Running Bioclipse Plugin Unit tests: solving the XPCOM error

Sometimes you can feel so stupid. For example, when the answer is right in front of you, but only after many hours you realize which question belongs to that answer. For example, take this answer:
    add the line: -Dorg.eclipse.swt.browser.XULRunnerPath=/usr/lib/xulrunner
This is the problem I was trying to solve: I'm running 64bit Ubuntu Jaunty with Eclipse 3.4.2 for Bioclipse development. The answer above is the correct answer. So, I added the line, both to the $HOME/eclipse.ini and to the eclipse command line used to start the program. But I still could not run Bioclipse plugin unit tests; I kept getting that stupid error:
    org.eclipse.swt.SWTError: XPCOM error -2147467262
    at org.eclipse.swt.browser.Mozilla.error(Mozilla.java:1638)
    at org.eclipse.swt.browser.Mozilla.setText(Mozilla.java:1861)
In retrospect, I was sort of asking the wrong question. I should have asked myself not why I got that XPCOM error even though I was using the solution, but why running the unit tests was not affected by that solution. Realizing that, it became so obvious: the plugin unit testing was using a clean environment, not based on the Eclipse environment I was working in; therefore, adding that line to my Eclipse environment did not help. Instead, I only had to add that line to the Run Configuration of my plugin unit tests too:

Surely, there are aspects to this which helped me overlook this solution. For example, I had installed Eclipse freshly yesterday, and then it worked fine. Only after installing some EMF and GEF features, it stopped working again. Bitten by the correlation/causation pattern :(
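A quick way to catch this kind of mismatch is to print the property from inside the JVM that actually runs the tests, because a -D argument only reaches the JVM it was passed to. A minimal sketch (the class name is hypothetical; only the property name comes from the answer above):

```java
import java.util.Properties;

// Hypothetical diagnostic; call it from a unit test or the launched application.
class XulRunnerCheck {

    static final String KEY = "org.eclipse.swt.browser.XULRunnerPath";

    // Reports whether the given system properties contain the XULRunner path
    static String describe(Properties properties) {
        String path = properties.getProperty(KEY);
        return path == null
            ? KEY + " is not set in this JVM"
            : KEY + "=" + path;
    }

    public static void main(String[] args) {
        System.out.println(describe(System.getProperties()));
    }
}
```

Run from the plugin unit test launch, this would have shown the property to be missing there, even though it was set for the workbench itself.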

Sunday, June 21, 2009

Chemistry in Eclipse: Bioclipse-JChemPaint

The Uppsala and EBI CDK-teams have been working hard on finishing the rewrite of JChemPaint I started with Niels earlier. While the EBI-team focused on the applet (and Swing application), the Uppsala team, obviously, focused on the SWT side, for integration into Bioclipse. The new JChemPaint is reaching a useful state, and below is a quick update screenshot of something Arvid has been working on:

It shows a periodic table which allows you to drag any element type onto the JChemPaint drawing area. It is using regular drag and drop functionality, allowing you to create any arbitrary pseudo atom too. This also paves the way for a template system, allowing you to drag-n-drop fragments onto an active JChemPaint editor.

Tuesday, May 19, 2009

Eclipse-Spring Export problem: uses conflict for spring aop

And I just got around to grasping some more of the details of handling dependencies with plugins. Bioclipse has a product file, which uses features to run it. That works fine. It also uses Spring (version 1.0.2 in our case) in a set up where we use custom managers to do stuff, like run things from a JavaScript environment (e.g. this), but use the same methods to be run from Bioclipse GUI elements, like buttons, menus and wizards. The Spring framework ensures the proper thread is used, and also provides recording (thanx to Jonathan for doing all this).

However, a recent refactoring (which introduces a new plugin net.bioclipse.managers) broke exporting Bioclipse. It still runs fine from within Eclipse, but the exporting fails with this error:



Last night, I tried many things, on top of what Jonathan has been trying for the last few days. Using git bisect I pinpointed the exact commit that caused it to fail (which is the earlier linked revision 10373), but could not find where we are actually referring to multiple Spring bundles. Our Spring bundles are in these jars and that has worked for a very long time.

I have no clue why it finds this conflict, and am clueless on how to further debug the issue. Any comment is most welcome! No matter how insignificant it may seem, I am sort of stuck and any tip will likely allow me to move forward. Thanx!

Monday, May 11, 2009

Which feature must I install for org.eclipse.zest?

Dear lazyweb!

I have been trying to figure out which Eclipse 3.4 feature I must install from the update site to get the org.eclipse.zest plugin in my environment.

I installed the Zest feature (which I am going to use to visualize an RDF network), but my workspace still complained that I did not have the plugin.

Maybe I should rerun Set Target Platform for our product, but I and others in the Bioclipse development community have been wondering how we can know what feature to install via the Software Updates... to get a particular plugin on your machine?

Looking forward to hearing from you,

Kind regards,

Egon

Wednesday, April 15, 2009

Bioclipse2 Scripting #3: XLogP calculation using an XMPP CDK cloud service

In preparation of the CDK workshop next week, here is a small Bioclipse2 script to calculate the XLogP value for a given SMILES, using a CDK-based XMPP service:


Earlier in this series:

Multiple inheritance for content types?

Bioclipse is an environment for handling and processing life sciences data. This data is present in files with a wide variety of formats, each of which can contain a particular data type. For example, we can have a single molecule in an MDL molfile and in CML.

The latter is particularly interesting, as I do not know how to work that out... Firstly, I want the CML (Single Molecule) content type to extend the CML content type, so that a validating CML editor can open it with the proper schema, but at the same time I would like it to extend a content type representing a Single Molecule. Hence, the multiple inheritance.

This is what the plugin.xml currently looks like:
<extension
    point="org.eclipse.core.runtime.contentTypes">

  <content-type
      base-type="net.bioclipse.contenttypes.cml"
      id="net.bioclipse.contenttypes.cml.singleMolecule2d"
      name="CML (Single 2D Molecule)"
      priority="high">
    <describer class="net.bioclipse.cml.contenttypes.CmlFileDescriber">
      <parameter
          name="dimension"
          value="2D"/>
      <parameter
          name="cardinality"
          value="single"/>
    </describer>
  </content-type>

</extension>
Very clearly, a single base-type. Is there any option of multiple inheritance?

Tuesday, April 14, 2009

Bioclipse: a powerful Jmol application

While Bioclipse is much more, it could be an interesting alternative to the Jmol application. It offers:
  • a scripting console
  • a file browser (the Eclipse way)
  • an outline of the file content which allows selections
  • a script editor
The underlying RCP toolkit has many other interesting features for a Jmol application, but the above is up and running:

Monday, March 23, 2009

Highlighting Console output in Eclipse with Grep Console

I ran into an Eclipse Grep Console plugin (EPL license) today that takes regular expressions to color output in the Console. Given the amount of output Bioclipse and the CDK give when in DEBUG mode, this allows me to highlight those bits I am interested in. For example, comments on the Bioclipse managers:

Sunday, February 22, 2009

Solubility Data in Bioclipse #2: handling RDF

RDF is swiftly becoming the lingua franca of life sciences (see for example [1,2]). Bioclipse is an excellent platform to visualize results from analysis of the network, both for graph visualization (see [3]), as well as for visualization of domain specific data types (e.g. sequences, molecules, ...).

Yesterday I uploaded a Bioclipse feature that adds an rdf manager to handle RDF content, which includes SPARQL support. The below snippet shows its application to the solubility data [3]:

Maybe RDF support in Eclipse is an idea for its Google Summer of Code?

See also:
  1. One Billion Biochemical RDF Triples!
  2. RDF-ing molecular space
  3. Solubility Data in Bioclipse #1

Friday, January 16, 2009

Bioclipse and Gist integration

As you might have read, Bioclipse has scripting support (see for example, Scripting JChemPaint), and we have been collecting the scripts on Gist and indexing them on Delicious with the tags bioclipse and gist. This provides a nice overview of what you can do with the current SVN version of Bioclipse2. And, hopefully, when released, it will allow users to quickly learn about Bioclipse features, allow people to share scripts etc. Think of it as MyExperiment.org for Bioclipse.

Now, what was missing until today, was easy access to gists in Bioclipse itself. No gist.load(33421) yet. There still is not, but I uploaded earlier today a Wizard for it. (The manager will follow later). Right click on an open Project, select New -> Other, and pick Download Gist:

and click Next:

Then, just type the number of the Gist you want to open in Bioclipse, for example 18315 (see Bioclipse2 Scripting #1: from SMILES to a UFF optimized structure in Jmol), and click another Next to select a file name and location:

The current code does require you to know the Gist number, so you'll need a web browser to look it up, but we do have search facilities in mind. Also, although the code attempts to, the resulting Gist is not automatically opened in an editor (a bug). Another idea is to just install the egit plugin in Bioclipse :)

Saturday, January 3, 2009

Editing and Validation of CML documents in Bioclipse

One advantage of using XML is that one can rely on good support in libraries for functionality. When parsing XML, one does not have to take care of the syntax, and can focus on the data and its semantics. This comes at the expense of verbosity, though, but having the ability to express semantics explicitly is a huge benefit for flexibility.

So, when Peter and Henry put their first documents online about the Chemical Markup Language (CML), I was thrilled, even though it actually was still SGML when I encountered it. The work predates the XML recommendation. As I recently blogged, in '99 I wrote patches for Jmol and JChemPaint to support CML, which were published as a preprint on the Chemical Preprint Server and as a paper in 2000 in the Internet Journal of Chemistry. Neither of the two has survived.

Anyway, the Chemistry Development Kit makes heavy use of CML, and Bioclipse supports it too. Now, Bioclipse is based on the Eclipse Rich Client Platform architecture, for which there exist quite a few XML tools in the Web Tools Platform (WTP). Among these is a validating, content-assisting XML editor. This means, I get red markings when I make my XML document not-well-formed or invalid. Just a quick recap: well-formedness means that the XML document has a proper syntax: one root node, properly closed tags, quotes around attribute values, etc. Validity, however, means that the document is well-formed, but also hierarchically organized according to some specification.
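To make the well-formedness half of that recap concrete, here is a minimal sketch using only the JAXP parser that ships with Java. It is not how the WTP editor works internally, the class name is hypothetical, and it checks well-formedness only; validating against an XML Schema would additionally need the schema file and the javax.xml.validation API.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration of a well-formedness check with plain JAXP.
class WellFormed {

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser reports syntax problems as SAX exceptions
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Quoted attribute value and closed tags
        System.out.println(isWellFormed("<molecule><atom id=\"a1\"/></molecule>"));
        // Unquoted attribute value and unclosed <atom>
        System.out.println(isWellFormed("<molecule><atom id=a1></molecule>"));
    }
}
```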

Enter CML. CML is such a specification, first with DTDs, but after the introduction of XML Namespaces with XML Schema (see There can be only one (namespace)). The WTP can use this XML Schema for validation, and this is of great help learning the CML language. Pressing Ctrl-space in Bioclipse will now show you what allowed content can be added at the current character position.

Yes, Bioclipse can do this now (in SVN, at least). This has been on my wishlist for at least two years now, but I never really found the right information. Now, three days ago David wrote about End of Year Cramps in which he describes some of his work on the WTP for autocomplete for XPath queries. He see[s] a brighter future for XML at eclipse over the next year. I hope that those in the eclipse and XML community will help to continue to improve the basic support, so that first class commercial quality applications that leverage this support can continue to be built.

That was enough of a statement for me to ask in the comments how to make the WTP XML editor aware of the CML XML Schema. It already picked up XML Schemas with xsi:schemaLocation, but I needed something that worked without such statements in the XML document itself. David explained to me that I could use the org.eclipse.wst.xml.core.catalogContributions extension point. This was really easy, and committed to Bioclipse SVN as:
<extension
    point="org.eclipse.wst.xml.core.catalogContributions">
  <catalogContribution>
    <uri name="http://www.xml-cml.org/schema"
         uri="schema24/schema.xsd"/>
  </catalogContribution>
</extension>
However, that does not make the WTP XML editor available in the Bioclipse application yet. Not even in the "Open With"... So, I set up a CML Feature. After a follow-up question, it turned out that the CML content type of Bioclipse was already a sub type of the XML type:
<extension
    point="org.eclipse.core.runtime.contentTypes">
  <content-type
      base-type="org.eclipse.core.runtime.xml"
      id="net.bioclipse.contenttypes.cml"
      name="Chemical Markup Language (CML)"
      file-extensions="cml,xml"
      priority="normal">
  </content-type>
</extension>
So, the only remaining problem was to actually get the WTP XML editor as part of the Bioclipse application. The new CML Feature takes care of that (I hope the export and building the update site work too, but that's yet untested), by importing the relevant plugins and features. Last night, however, I ended up with one stacktrace which gave me little clue on which plugin I was still missing.

Therefore, I headed to #eclipse and actually met David of the blog that started this again. He asked nitind to think about it too, and they helped me pin down the issue. The relevant bit of the stacktrace turned out to be:
Caused by: java.lang.IllegalStateException
at org.eclipse.core.runtime.Platform.getPluginRegistry(Platform.java:774)
at org.eclipse.wst.common.componentcore.internal.impl.WTPResourceFactoryRegistry$ResourceFactoryRegistryReader.(WTPResourceFactoryRegistry.java:275)
at org.eclipse.wst.common.componentcore.internal.impl.WTPResourceFactoryRegistry.(WTPResourceFactoryRegistry.java:61)
at org.eclipse.wst.common.componentcore.internal.impl.WTPResourceFactoryRegistry.(WTPResourceFactoryRegistry.java:55)
... 37 more
This referred to this bit of code in Eclipse's Platform.java:
Bundle compatibility = InternalPlatform.getDefault()
    .getBundle(CompatibilityHelper.PI_RUNTIME_COMPATIBILITY);
if (compatibility == null)
    throw new IllegalStateException();

So, the plugin I turned out to be missing was org.eclipse.core.runtime.compatibility. Apparently, some parts of the WTP that the XMLEditor uses still rely on Eclipse 2.x technology.

This screenshot shows the WTP XMLEditor in action in Bioclipse on a CML file. It shows the document contents in the 'Design' tab, which also shows the allowed content, as derived from the XML Schema for CML. Also, note that the Outline and Properties views come for free, which give more detail and allow navigation of the content.

This screenshot shows the 'Source' tab for the same file, where I deliberately changed the value of the @id attribute of the first atom. The value does not validate against the regular expression defined in the CML schema for @id attribute values. It also shows content assist in action. At any location in the CML file, I can hit Ctrl-Space, and the editor will show me which content I can add at that location.

This makes Bioclipse a perfect tool to craft CML documents and learn the language.

Scripting JChemPaint

Stefan, Gilleain, Arvid and I had a JChemPaint Developers Workshop in Uppsala, to sprint the development of JChemPaint3, for which Niels laid the foundation a long time ago.

Gilleain and Arvid merged their branches into a single code base, while Stefan worked on the Swing application and applet. The Bioclipse SWT-based widget is being developed for Bioclipse2.

The new design separates widget/graphics toolkit specifics from the chemical drawing and editing logic. Regarding the editing functionality, this basically comes down to having a semantically meaningful edit API. This allows us to convert both Swing and SWT mouse events into things like addAtom("C", atom), which would add a carbon to an already existing atom. However, it does not take much imagination to see that it also allows adding a scripting language. This is what I have been working on. Right now, the following API is available from the Bioclipse2 JavaScript console (via the jcp namespace, in random order):
  • ICDKMolecule jcp.getModel()
  • IAtom getClosestAtom(Point2d)
  • setModel(ICDKMolecule) (for really fancy things)
  • removeAtom(IAtom)
  • IBond getClosestBond(Point2d)
  • updateView() (all edit commands issue this automatically)
  • addAtom(String,Point2d)
  • addAtom(String,IAtom) (which works out coordinates automatically)
  • Point2d newPoint2d(double,double)
  • updateImplicitHydrogenCounts()
  • moveTo(IAtom, Point2d)
  • setSymbol(IAtom,String)
  • setCharge(IAtom,int)
  • setMassNumber(IAtom,int)
  • addBond(IAtom,IAtom)
  • moveTo(IBond,Point2d)
  • setOrder(IBond,IBond.Order)
  • setWedgeType(IBond,int)
  • IBond.Order getOrder(int)
  • zap() (sort of sudo rm -Rf /*)
  • cleanup() (calculate 2D coordinates from scratch)
  • addRing(IAtom,int)
  • addPhenyl(IAtom)
This API (many more methods will follow) is not really aimed at the end user, who will simply point and click. The goal of this scripting language is, at least at this moment, to test the underlying implementation using Bioclipse. Future applications, however, may include simple scripts which use some logic to convert the editor content, for example replacing a t-butyl fragment with a pseudo atom "t-Bu". The key thing to remember is that this will allow Bioclipse to have non-CDK-based programs act on the JChemPaint editor content (e.g. using getModel() and setModel(ICDKMolecule)). More on that later.
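To make the idea of a toolkit-neutral edit API a bit more concrete, here is a minimal sketch in JavaScript. It is not JChemPaint code: createEditLog, onSwingClick and onSwtClick are hypothetical names, and the events are plain stand-in objects that only mimic how Swing (getX()/getY()) and SWT (public x and y fields) report mouse positions. Transforming screen coordinates into model coordinates is left out.

```javascript
// Hypothetical toolkit-neutral edit API. Only addAtom(symbol, point) is
// modelled here, and it merely records the calls it receives.
function createEditLog() {
  var calls = [];
  return {
    calls: calls,
    addAtom: function (symbol, point) {
      calls.push({ symbol: symbol, x: point.x, y: point.y });
    }
  };
}

// Swing-style adapter: the event reports its position via getX()/getY().
function onSwingClick(event, api) {
  api.addAtom("C", { x: event.getX(), y: event.getY() });
}

// SWT-style adapter: the event reports its position via x and y fields.
function onSwtClick(event, api) {
  api.addAtom("C", { x: event.x, y: event.y });
}

// Stand-in events (assumptions, not real toolkit objects).
var editLog = createEditLog();
var swingLikeEvent = {
  getX: function () { return 10; },
  getY: function () { return 20; }
};
var swtLikeEvent = { x: 10, y: 20 };

onSwingClick(swingLikeEvent, editLog);
onSwtClick(swtLikeEvent, editLog);
```

Both adapters end up issuing the same edit command, which is what makes it possible to put a script console on top of the same API.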

A simple script could look like: Or, as screenshot:
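A minimal sketch of such a script, using only methods from the list above. The jcp object is normally provided by Bioclipse; the createStubJcp stand-in below is an assumption of mine that only records atoms and bonds, so that the script part can also be run outside Bioclipse. The 1.5 offset for attached atoms is arbitrary, and the stub's getModel() returns a plain object rather than an ICDKMolecule.

```javascript
// Hypothetical stand-in for the jcp object of the Bioclipse JavaScript
// console. It mimics a few of the listed methods on plain objects.
function createStubJcp() {
  var atoms = [];
  var bonds = [];
  return {
    newPoint2d: function (x, y) { return { x: x, y: y }; },
    // addAtom(String, Point2d) places an atom at a position;
    // addAtom(String, IAtom) attaches a new atom to an existing one.
    addAtom: function (symbol, target) {
      var attachTo = atoms.indexOf(target) >= 0 ? target : null;
      var atom = {
        symbol: symbol,
        // Assumption: the stub simply offsets the attached atom.
        point: attachTo
          ? { x: attachTo.point.x + 1.5, y: attachTo.point.y }
          : target
      };
      atoms.push(atom);
      if (attachTo) bonds.push({ atoms: [attachTo, atom] });
    },
    getClosestAtom: function (point) {
      var closest = null;
      var best = Infinity;
      atoms.forEach(function (atom) {
        var dx = atom.point.x - point.x;
        var dy = atom.point.y - point.y;
        var distance = dx * dx + dy * dy;
        if (distance < best) { best = distance; closest = atom; }
      });
      return closest;
    },
    setSymbol: function (atom, symbol) { atom.symbol = symbol; },
    zap: function () { atoms.length = 0; bonds.length = 0; },
    getModel: function () { return { atoms: atoms, bonds: bonds }; }
  };
}

var jcp = createStubJcp();

// The script itself: clear the editor, place a carbon, attach a second
// atom to it, and turn that second atom into an oxygen.
jcp.zap();
jcp.addAtom("C", jcp.newPoint2d(0.0, 0.0));
var first = jcp.getClosestAtom(jcp.newPoint2d(0.0, 0.0));
jcp.addAtom("C", first);
var second = jcp.getClosestAtom(jcp.newPoint2d(1.5, 0.0));
jcp.setSymbol(second, "O");
```

Inside Bioclipse only the last six statements would be typed into the console, since jcp is already there.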

An RCP-dedicated blog

Yes, yet another blog by me. This one is special, being the first where I just copy/paste blog items from my other blogs. That way, people only interested in RCP-related stuff can tune in here. If only I knew how to do categories on blogger.com.