
Tuesday, March 31, 2009

REST in Django

So I'm now officially a Google Summer of Code mentor for the Django open source group! w00t! Now all I have to do is pull off all those cool ideas I came up with earlier, as well as have a group of open source developers all agree that they are as cool as I think they are (which could be easier said than done).

While brushing my teeth this morning, I was thinking "I wonder what sort of empirical study I could pull out of this situation? What sort of research is there to be done in the field of REST? What can I learn, for academic purposes, not my own, from this experience?"

Ideas? I'm going to mull it over at the gym.

Sunday, March 22, 2009

Modeling Topic

After a couple months of soul searching, the students enrolled in John Mylopoulos' Conceptual Modeling course are narrowing down ideas on what to model. The requirements of the project (to 'model something') left it fairly open, but perhaps too open to be immediately tractable. However, Michalis, Alicia, and I are closing the gap. I'm leaning toward modeling certain aspects of the REST web service framework I've been working on for the last little while in CSC 2125.

Obviously, system descriptions and class diagrams are uninteresting, as John pointed out earlier in the term, but modeling the requirements of what such a system should do is not. I'm looking into Tropos, but I'm not really sure it applies. However, we can probably do a goal model illustrating the motivation and principles of REST, contrasted with those of WS-*. Also, a use case diagram describing the interface to a general ROA service could be created. Finally, we could model certain pieces of important logic, such as a delayed GET, using a description logic syntax.

Tuesday, March 17, 2009

ORM-REST Code Sprint - Day 2

5:00 pm, day two of the code sprint is almost wrapped up. Not as much amazing progress today as I would have liked (I was minus one team member for some reason). That aside, here's what we've got:

Mohammad blocked out and implemented the pseudocode for the URL reverser described in our blog here. I took a further look at it, and filled in the magic that inverts the Django URL list and gives us a URL from a view name and primary key value. Yay! Plugged it into the XML serializer, and voilà! REST-like XML representation, with hyperlinks! The only thing missing from this bit is the 'http://hostname:port' part. From past experience, I've found this to be trickier than you might think (it gets hairy if you've got one web server feeding into another, or a proxy/load balancer in the way). I think we'll try just using relative URIs for now.
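To give a feel for the inversion step, here is a minimal sketch that does not use Django at all. It assumes a hand-written table of view names and URL regular expressions, and fills the named group in the pattern with the primary key. The table, the view names, and the `reverse_url` helper are all hypothetical stand-ins, not our sprint code and not Django's own resolver.

```python
import re

# Hypothetical pattern table: view name -> URL regex, in the style of a Django urlconf.
URL_PATTERNS = {
    "book_detail": r"^books/(?P<pk>\d+)/$",
    "author_detail": r"^authors/(?P<pk>\d+)/$",
}

# Matches a named group such as (?P<pk>\d+); assumes no nested parentheses.
NAMED_GROUP = re.compile(r"\(\?P<(\w+)>[^()]*\)")


def reverse_url(view_name, **params):
    """Return a relative URL for view_name, filling named groups from params."""
    pattern = URL_PATTERNS[view_name]
    # Replace each named group with the matching keyword argument.
    filled = NAMED_GROUP.sub(lambda m: str(params[m.group(1)]), pattern)
    url = "/" + filled.lstrip("^").rstrip("$")
    # Sanity check: the generated URL must match the pattern it came from.
    if not re.match(pattern, url.lstrip("/")):
        raise ValueError("parameters do not fit pattern %r" % pattern)
    return url


print(reverse_url("book_detail", pk=42))
```

The result is a relative URI, which sidesteps the hostname and port question for the moment.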

After a couple more feature points are implemented, this thing needs a huge refactoring pass to clean it up and encapsulate it. Also, it uses some classes from the Django Rest Interface, but this library has some pathological faults that I want to not include in the tool. Yay for open licensing.

Also, discussions with Aran produced some new feature proposals. Lots of useful, tiny, easy-to-implement things that will improve the overall RESTability of the library. Further yay!

Automatic URL localization

Automatic anything localization

Model introspection and url pattern creation

Computed resources instead of data resources - make http interface automatic

Monday, March 16, 2009

ORM-REST Code Sprint - Day 1

At quarter to 5:00, the first day of the ORM REST code sprint is winding down. Mo' and I hacked from 2:00, and this is what we've got to show:

Rory finished one direction of proper xml serialization of Django models. That is, given a [list of] model instance[s], we get either a nice xml document (unlike the object name="" pk="" garbage we had before), or a concise list of objects with names and placeholder URIs, which will be changed to live urls when Mo' gets his piece working.
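Roughly, the two output shapes look like the sketch below. It uses the standard library's `xml.etree.ElementTree` and a plain Python class in place of a real Django model; the `Book` class, the element names, and the `href` values are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET


class Book:
    """Hypothetical stand-in for a Django model instance."""

    def __init__(self, pk, title, year):
        self.pk = pk
        self.title = title
        self.year = year


def to_xml(book):
    """Full representation: one element per field instead of generic <object> tags."""
    root = ET.Element("book", {"href": "/books/%d/" % book.pk})
    ET.SubElement(root, "title").text = book.title
    ET.SubElement(root, "year").text = str(book.year)
    return ET.tostring(root, encoding="unicode")


def to_xml_list(books):
    """Concise listing: a name and a URI per object."""
    root = ET.Element("books")
    for book in books:
        item = ET.SubElement(root, "book", {"href": "/books/%d/" % book.pk})
        item.text = book.title
    return ET.tostring(root, encoding="unicode")


print(to_xml(Book(1, "Dune", 1965)))
print(to_xml_list([Book(1, "Dune", 1965), Book(2, "Emma", 1815)]))
```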

Mohammad synchronized with the svn, set up a django development environment, and familiarized himself with the code I had written. Following this, he began work on the reverse URL mapper. Given a model classname and a primary key value, he's pulling a live instance from the django ORM, and using it to query the Django URL dispatcher. This gives us the regular expression which will match URLs to access the specified object. Now he's got to turn the whole thing on its head! Good luck Mo! You can do it!

So, if you're keeping score, we are 1/2 + 1/2 = 1 feature point finished, out of 4. Might just make it by end of term :)

Tuesday, February 3, 2009

RESTful Questions

While working on my 2125 project, my partner and I created a quick little RESTful web service using CherryPy, and SQLAlchemy for persistence. SQLAlchemy worked wonderfully. CherryPy did a great job of making data-driven web pages, and the MethodDispatcher made it easy to invoke certain methods within a class when an http request comes in, based on the http method. This seemed almost ideal for REST, but some clunkiness in the design prevents it from being really what we're after.

What are we after, exactly? We're trying to find ways in which we can avoid duplication of effort when using both Object Relational Mappers and RESTful Web Services. In their book "RESTful Web Services", Richardson and Ruby hit on the point that the process of translating objects into REST resources is very similar to the process of translating the same objects into tables in a relational database. So, if a web service is storing objects in a database and exposing them via a rest api, we would be doing the same sort of mapping procedure twice.

My partner (in crime?) and I had a chat with Greg about this, and came up with some questions to investigate. Below are the questions and their answers:
How do REST APIs represent foreign-key relationships (ie. object aggregation)? Specifically, are references to the other objects stored/returned, or the entire object on each request?

It is common REST practice to return hyperlinks to other objects/resources that are aggregated by the given resource. This would require an additional http request for each referenced resource.

Can we uniquely identify object instances (REST resources) based on some identifier?

Yes. The resource's URI is its identifier. Every resource has one that identifies it. However, it is possible for one resource to have many URIs that point to it (ex. /releases/2_05 and /releases/latest could be the same thing).

Can we cache REST objects on the client side, based on their identifier (whatever that may happen to be)?

Yes. It would be silly not to. However, the multi-identifier problem stated in the last answer might make this less efficient.

If we assume that the meat of the service is some object graph (probably a DAG), can we reconstruct the graph on the client side, out of stubs instead of actual objects, given identifiers and caching?

I think so.
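Here is a small sketch of what I mean by stubs plus caching. It assumes each representation carries a list of hyperlinks to the resources it aggregates; the `links` field, the `FAKE_SERVER` dictionary (standing in for real HTTP GETs), and the `Stub` class are all hypothetical.

```python
# Hypothetical in-memory "server": URI -> representation with hyperlinks to children.
FAKE_SERVER = {
    "/projects/1": {"name": "orm-rest", "links": ["/releases/2_05"]},
    "/releases/2_05": {"name": "2.05", "links": []},
}

# Records every simulated request so the effect of caching is observable.
fetch_log = []


def fetch(uri):
    """Stand-in for an HTTP GET."""
    fetch_log.append(uri)
    return FAKE_SERVER[uri]


class Stub:
    """Placeholder for a remote resource; loads its data on first use."""

    _cache = {}  # URI -> Stub, so each URI is represented by one object

    def __init__(self, uri):
        self.uri = uri
        self._data = None

    @classmethod
    def for_uri(cls, uri):
        """Return the cached stub for uri, creating it if needed."""
        if uri not in cls._cache:
            cls._cache[uri] = cls(uri)
        return cls._cache[uri]

    def _load(self):
        if self._data is None:
            self._data = fetch(self.uri)
        return self._data

    @property
    def name(self):
        return self._load()["name"]

    @property
    def children(self):
        # Children come back as stubs; nothing is fetched until they are used.
        return [Stub.for_uri(u) for u in self._load()["links"]]


project = Stub.for_uri("/projects/1")
release = project.children[0]
print(project.name, release.name, len(fetch_log))
```

Because the cache is keyed by URI, an alias such as /releases/latest would get its own entry and its own request, which is the multi-identifier inefficiency mentioned above.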

That's all for now. Check here or on the project wiki for more information coming soon!

What I've Read This Week

Singer, J.; Vinson, N.G., "Ethical issues in empirical studies of software engineering," Software Engineering, IEEE Transactions on , vol.28, no.12, pp. 1171-1180, Dec 2002
URL: http://www.ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1158289&isnumber=25950

This paper presents a handful of ethical dilemmas that researchers who conduct empirical studies can get themselves into, along with advice on getting out or avoiding the situation altogether.

What kinds of studies could be created which contain no human subjects, but in which individuals can be identified (i.e. from their source code)?
When can an employee's participation in an empirical study threaten their employment?
Is it possible to conduct a field study in which management doesn't know which of their employees are participating?
Should remuneration rates be adjusted to compete with a standard software engineer's salary?
Are raffles or draws valid replacements for remuneration? Does the exclusivity of the compensation (ie. only one subject wins the iPod) affect the data collected by the study? Will subjects 'try harder' in the task assigned if they think they may win a prize? Can prizes affect working relationships/situations after the researcher has left?
Does ACM Article 1.7 eliminate deceptive studies?
Regarding written consent/participation forms, does having a large number of anticipated uses of the data detract from a study's credibility, and thereby make subjects less likely to participate?

John P. A. Ioannidis, "Why Most Published Research Findings Are False"
URL: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1182327

This paper describes a detailed statistical method (proof?) illustrating evidence that the majority of research papers published in this day and age go on to be refuted in the near future.

What is the 'power' the authors are referring to?
Is corollary 5 (corporations sponsoring research suppress findings that they deem unfavorable for business reasons) just plain evil or misleading?
Null fields sound interesting. How do I tell if I'm stuck in a null field?
How do we determine R for a given field?

M. Jørgensen, and D. I. K. Sjøberg (2004) "Generalization and Theory Building in Software Engineering Research"
URL: http://simula.no/research/engineering/publications/SE.5.Joergensen.2004.c

Null hypotheses are a telltale sign of (sometimes misused) statistical hypothesis testing. Should we as readers be concerned when we see clearly stated null hypotheses?
In their recommendations, the authors suggest that purely exploratory studies hold little or no value, given that a vast amount of knowledge concerning software engineering has been accumulated in other, older fields such as psychology. Although I agree that cross-disciplinary research is useful for SE, and many old ideas can be successfully applied in SE, I'm not sure I agree that there is no use in exploratory studies.
Proper definition of populations and subject sampling is important
It is difficult to transfer the results in one population to another. The most common example of this is performing a study on CS grad/undergrad students and expecting it to transfer to professionals. Is there any way we as CS grad students can perform studies that will be relevant to professionals, then?

Still working my way through RESTful Web Services. Just wrapped up the authors' definition of ROA (resource oriented architecture). Very interesting. Hopefully this answers some questions brought up by my 2125 project.

Also on the stack are this paper about the Snowflock VM System and A Software Architecture Primer.

And, if there's time, I'll try to finish Ender's Game.

Thursday, January 22, 2009

ORM Mapping for Web Service Definition

This post is an experiment with the Blogger/Google Docs interoperability functionality. I'm not terribly impressed with the quality of the translation between doc and blog post. If you're as disgusted with the layout as I am, feel free to read the Google doc here. This is a document describing an idea Greg proposed to me and another student in his CSC2125 class. In one sentence, the problem is: Can we use the object mapping definition from an Object-Relational Mapping tool to describe objects/resources in a RESTful web API, and in so doing leverage some of the benefits ORMs have lent to persistence, reduce redundancy, and generally make people happier? Unfortunately, the more I look into it, the more I think the answer is 'no'. However, we're not quite finished the investigation yet.

ORM Mapping for Web Service Descriptors

Traditional Situation

Figure 1 illustrates the traditional deployment situation for a client/server application, in which the client communicates with the server via an exposed web service, and the server persists data into a database using an object-relational mapper. The server-side process consists of a layered architecture, in which the business logic interfaces with the database via an ORM mapping layer. The application logic stores its data in the form of objects (hence the ORM), and a mapping is defined by the application programmer between the class definitions for these objects and a relational database schema.

The client-side application code performs some useful operation with the data or service exposed by the server-side business logic. To access this information or service, the client application utilizes stub objects. These stubs expose the same interface as the live objects on the server (possibly a subset of methods for security/feasibility reasons), but the implementation of the object lives on the server; client-side methods all contain logic for making calls to the server, and returning the response as if the method were implemented on the client. These stub objects are created automatically at build time by a tool which is able to read a description of the web service, and translate it into source code which can be compiled and used by the application. In traditional web services, this description is a WSDL file. <what is this for a REST web service??>
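As a rough illustration of the stub idea, the sketch below forwards every method call on the client-side object to a live object on the "server". Note the swap: the stubs described above are generated at build time from a service description, whereas this sketch uses Python's dynamic `__getattr__` to keep things short. The `Account` class, the `fake_transport` function (which replaces a real network call), and the object id are all hypothetical.

```python
class Account:
    """Hypothetical server-side object holding the real implementation."""

    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


# Hypothetical registry of live objects on the server side.
SERVER_OBJECTS = {"acct-1": Account(100)}


def fake_transport(object_id, method_name, args):
    """Stand-in for a network call: look up the live object and invoke the method."""
    target = SERVER_OBJECTS[object_id]
    return getattr(target, method_name)(*args)


class RemoteStub:
    """Client-side stub: same method names as the server object, no local logic."""

    def __init__(self, transport, object_id):
        self._transport = transport
        self._object_id = object_id

    def __getattr__(self, method_name):
        # Only reached for names not defined on the stub itself.
        def call(*args):
            return self._transport(self._object_id, method_name, args)
        return call


stub = RemoteStub(fake_transport, "acct-1")
print(stub.deposit(25))
```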

Figure 1: Traditional Client-Server Web Service Deployment, with ORM Mapping Definition and a Web Service Descriptor

Problem

The problem with this deployment is that there are redundancies in the way the shared objects and web service are defined, which could be streamlined to the benefit of both server-side programmers and clients who wish to interface with the server. In the event that the class definition for one of the shared objects changes (ex. adding a new public member), the server-side application programmer must update both the ORM Mapping Definition file/logic, as well as the Web Service Descriptor, and the client-side programmer may be required to at least rebuild their application, to update the stubs (this is addressed in Versioning).

Single Mapping Definition

It has been proposed that the ORM Mapping Definition and Web Service Descriptor can be combined into one artifact, as the two separate documents both essentially describe how to serialize instances of a given class. With hopefully only a small amount of modification, an ORM Mapping Definition could be used to serve both these purposes. Also, if the ORM side of the interface to this artifact is properly preserved, it is hoped that it can still be used for many of the other functions the ORM layer uses it for, like relational database schema migration/exporting.

Note: it may be interesting to investigate this further, as the ORM Mapping Definition serializes an object's state, but a Web Service Descriptor would likely only describe an object's behavior/interface!
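A toy version of the single-artifact idea might look like the following. One descriptor lists a class's fields, and two functions read it: one derives a relational schema, the other derives a resource representation. The descriptor layout, the function names, the JSON output format, and the URI scheme are assumptions for illustration only; no existing ORM is being described here.

```python
import json

# Hypothetical shared descriptor: one definition of the class's fields.
BOOK_DESCRIPTOR = {
    "name": "book",
    "fields": [
        ("id", "INTEGER PRIMARY KEY"),
        ("title", "TEXT"),
        ("year", "INTEGER"),
    ],
}


def create_table_sql(descriptor):
    """ORM side: derive a relational schema from the descriptor."""
    columns = ", ".join(
        "%s %s" % (name, sql_type) for name, sql_type in descriptor["fields"]
    )
    return "CREATE TABLE %s (%s)" % (descriptor["name"], columns)


def to_resource(descriptor, row):
    """Web service side: derive a JSON representation from the same descriptor."""
    body = {name: row[name] for name, _ in descriptor["fields"]}
    body["href"] = "/%ss/%s" % (descriptor["name"], row["id"])
    return json.dumps(body, sort_keys=True)


print(create_table_sql(BOOK_DESCRIPTOR))
print(to_resource(BOOK_DESCRIPTOR, {"id": 1, "title": "Dune", "year": 1965}))
```

Both outputs describe state only, which is exactly the caveat in the note above: nothing in this descriptor says anything about behavior.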

Figure 2: Client-Server Web Service Deployment, with single Shared Object Descriptor

Versioning

Following from the automatic schema migration/updating, questions are raised about how similar functionality could be used to span the client-server gap, not just the server-database gap. Obviously, client-side stub classes can't be updated completely automatically, as this would require rebuilding the application. However, the server could expose several concurrent versions of the same service, multiplexed based on a version field in incoming requests. As part of the schema update process, in addition to modifying the database, the ORM layer (or some other piece of code) could generate the infrastructure required to support backward-compatible calls to the web service API.
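The multiplexing part could be as simple as the sketch below, which assumes a request is a dictionary with an optional `version` field. The handler names, the request shape, and the choice to default to the newest version are hypothetical; generating the older handlers automatically during a schema update is the open part and is not shown.

```python
def get_book_v1(book):
    """Original representation: title only."""
    return {"title": book["title"]}


def get_book_v2(book):
    """Representation after a schema change that added 'year'."""
    return {"title": book["title"], "year": book["year"]}


# Version field in the request -> handler; old versions stay registered.
HANDLERS = {1: get_book_v1, 2: get_book_v2}
LATEST = max(HANDLERS)


def handle_request(request, book):
    """Route on the request's version field, defaulting to the latest version."""
    version = request.get("version", LATEST)
    if version not in HANDLERS:
        raise ValueError("unsupported API version: %r" % version)
    return HANDLERS[version](book)


book = {"title": "Dune", "year": 1965}
print(handle_request({"version": 1}, book))
print(handle_request({}, book))
```

A client built against version 1 keeps working after the schema change, and only needs a rebuild when it wants the new field.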

Rory Tulk's Blog