Tarlog on Java: April 2012

Monday, April 16, 2012

REST Best Practices: Using HTTP Verbs

A common mistake about RESTful APIs is the belief that they MUST be CRUD-like: Create maps to POST, Read maps to GET, Update maps to PUT, and Delete maps to DELETE.

In fact, this is incorrect. They MAY be CRUD-like or MAY be something else. The rule is: each resource MUST have a set of predefined operations, but not necessarily CRUD. It is not even necessary to use GET-POST-PUT-DELETE. Did you know that HTTP/1.1 defines 8 HTTP verbs: HEAD, GET, POST, PUT, DELETE, OPTIONS, TRACE and CONNECT? And that, in addition, you may define your own HTTP verbs that behave as you wish? For example, WebDAV defines the following verbs in addition to the standard ones: PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, UNLOCK.
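To show that a non-standard verb is just another method name on the wire, here is a minimal sketch that builds (but does not send) a PROPFIND request. It assumes Java 11 or later for the java.net.http client; the class name CustomVerbExample and the example.com URL are made up for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class CustomVerbExample {
    // Builds a request that uses the WebDAV PROPFIND verb instead of a standard one.
    // The request is only constructed here; nothing is sent over the network.
    static HttpRequest propfind(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .method("PROPFIND", HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = propfind("http://example.com/books/");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Whether the server understands the verb is a separate question, and the client and the server must agree on it in advance.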

So now that you know you do not have to stick to the standard HTTP verbs, I still advise you to do so. Why? It's relatively clear what these methods will do with the resource. It may be less clear what your own verb will do.

So let's assume that 90% of RESTful APIs will use only the standard HTTP verbs. What is important to pay attention to? The most important rule is to use the HTTP verbs correctly. The most common example of incorrect use of the GET verb comes from HTML forms: in HTML, both GET and POST can be used to submit form data to a server. But GET is actually a safe method, which means that it SHOULD NOT have the significance of taking an action other than retrieval.

So remember, Rule 1: HEAD and GET are safe methods.

Now about the idempotent methods. Methods are idempotent if the side effects of N > 0 identical requests are the same as for a single request. The idempotent methods are GET, HEAD, PUT and DELETE.
Some examples of what this means:
1. PUT can be used to create a resource. Yes, I know the best practice says that it is POST that creates resources, and in a second I'll explain why. So let's say PUT creates a resource. If the request is duplicated for some reason, the second request must not create a new resource, and it should return the same response as the first request. It's important to understand that various HTTP intermediaries know that PUT is idempotent and may rely on it, for example by repeating a request that appears to have failed. So POST is actually much safer, especially for creating resources. But pay attention: POST may also be used for updates and even for fetching the resource! Nobody said it is only for creation.
2. When DELETE-ing a resource, pay attention that the second DELETE request should return the same response. So if the first response is "200 OK", the second (third, fourth, etc.) DELETE request to the same resource should also return "200 OK". Note that once the resource has been deleted, a GET request to it should return "404 NOT FOUND", but DELETE should continue returning "200 OK".

So Rule 2 will be: use idempotent methods correctly. If you cannot ensure that you use them correctly, use another method; POST can be a good choice.
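The two examples above can be sketched with a tiny in-memory store. This is only an illustration under my own assumptions: the class IdempotentBookStore and its methods are hypothetical, the methods return bare status codes instead of real HTTP responses, and DELETE keeps answering 200 as recommended above.

```java
import java.util.HashMap;
import java.util.Map;

class IdempotentBookStore {
    private final Map<String, String> books = new HashMap<>();

    // PUT: the client names the resource, so a duplicated request
    // overwrites the same entry instead of creating a second one.
    int put(String id, String title) {
        books.put(id, title);
        return 200;
    }

    // DELETE: the same status is returned whether or not the book still exists.
    int delete(String id) {
        books.remove(id);
        return 200;
    }

    // GET: a deleted (or unknown) book is simply not found.
    int get(String id) {
        return books.containsKey(id) ? 200 : 404;
    }

    int size() {
        return books.size();
    }

    public static void main(String[] args) {
        IdempotentBookStore store = new IdempotentBookStore();
        store.put("ABC", "Winnie the Pooh");
        store.put("ABC", "Winnie the Pooh"); // duplicated request
        System.out.println("books after two PUTs: " + store.size());
        System.out.println("first DELETE: " + store.delete("ABC"));
        System.out.println("second DELETE: " + store.delete("ABC"));
        System.out.println("GET after DELETE: " + store.get("ABC"));
    }
}
```

No matter how many times the same PUT or DELETE is repeated, the store ends up in the same state as after the first request.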

Now a little about the methods that you are unlikely to implement yourself. The behavior of HEAD, TRACE, CONNECT and OPTIONS is well defined, and they are usually implemented by the infrastructure (CONNECT and TRACE by the web servers, HEAD and OPTIONS by frameworks like Servlets or JAX-RS). If, for some reason, you decide to implement one of these methods, make sure you do it correctly.

Summary

1. Don't try to map CRUD to HTTP verbs. There is no need to do it.
2. Create new HTTP verbs, if needed.
3. Use existing HTTP verbs correctly; pay special attention to the safe methods (HEAD and GET) and to the idempotent methods (HEAD, GET, PUT and DELETE).
4. If you are not sure which HTTP verb to choose, use POST.

Wednesday, April 11, 2012

REST Best Practices: Use JSON

Something that RESTful design does not care about is the actual representation. The concept says that a resource can have one representation or more, and this does not affect the resource itself. JAX-RS solved this very beautifully with a clear separation between Resources and Providers.

And then real life happens, and you need to implement a resource and represent it somehow. The natural choice for programmers who come to the RESTful world from SOAP is XML.
There are several reasons for it, and the main one is that they are used to XML, so why change something that works? Or at least worked?
Wrong! You don't need to stick to something just because you are used to it. You can use something better. And JSON is better. Why? The main reason: it's simpler - it has no namespaces and no attributes. There is less to write and fewer places to make a mistake.
Compatibility issues are also easier to handle with JSON: since you don't have XSDs, you cannot fail schema validation. New fields can simply be ignored by the old version (as long as its parser is set up to tolerate unknown fields) and that's all.
You can always create XML out of JSON in case your client prefers to get XML. But it's not so easy to create JSON out of XML (yes, I have heard of BadgerFish, but if you start your design from JSON and then add XML, you don't need BadgerFish).
JSON is also less verbose, which saves you traffic, but really that's not the main reason to use it. The main reason is SIMPLICITY!

Now a little implementation note for those of you who use Java: create the classes that you want to be sent/received by the resources as POJOs. Keep them simple and prefer String as the main data type. If you need something complex, like Date, do the formatting yourself and describe it in your documentation (SimpleDateFormat will help you). Once you have a set of simple POJOs, use Jackson to create/parse JSON and JAXB (bundled with the JDK from Java 6 through Java 10) to create/parse XML. Jackson comes with JAX-RS Providers. JAXB Providers are already included in the JAX-RS implementations. Thus, in one shot, you'll get both JSON and XML representations for your resources.
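Here is a minimal sketch of such a POJO. The class BookDto, its fields and the chosen date pattern are my own assumptions for illustration; the only point is that the date travels as a String in a documented format.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class BookDto {
    private String title;
    private String published; // date as text, e.g. 2012-04-11T00:00:00Z (always UTC)

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }

    public String getPublished() { return published; }
    public void setPublished(String published) { this.published = published; }

    // Formats a Date in the documented pattern. A new SimpleDateFormat is created
    // per call because SimpleDateFormat instances are not thread-safe.
    static String formatDate(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(date);
    }

    public static void main(String[] args) {
        BookDto book = new BookDto();
        book.setTitle("Winnie the Pooh");
        book.setPublished(formatDate(new Date()));
        System.out.println(book.getTitle() + " / " + book.getPublished());
    }
}
```

With only String fields, getters, setters and the implicit no-argument constructor, the same class can be handed to either a JSON or an XML binding library without further changes to its shape.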

Back to representations. What if you cannot use JSON? Of course such a thing can happen. For example, you may want to upload an image, and there is no need to try to fit it into JSON. It's fine. Use whatever representation you want, but AVOID CUSTOM REPRESENTATIONS! If you feel that you must invent some new serialization mechanism that was not invented before, think twice. Are you really so unique? Really? Remember, you may have different clients. Each of them will have to know your serialization mechanism. Will it be simple for them? Or will you need to create a custom library? Remember: there are a lot of different languages, technologies and platforms, and you may need to support all of them.

Summary

1. Design your APIs to send/receive JSON.
2. If you need XML, it can easily be added without writing a single line of code (at least with JAX-RS).
3. Need to send/receive something binary? Fine. But do your best to use standard formats and don't reinvent the wheel (unless you feel that it's a must).
4. Never use the standard Java serialization mechanism. If you need to serialize a Java class, read #1 again and use JSON ;)

REST Best Practices: Create a good URL

REST is not a standard, therefore you are free to choose how to use it. It's very hard to say whether you are doing something "100% right" or "100% wrong". And still there are good RESTful APIs and bad RESTful APIs. Probably the most important part of your RESTful API is the URLs. They identify your resources. Well-designed URLs make your API look good.

And here come some practices that I personally believe are best (or just good):

1. A URL must uniquely identify the resource - it should be impossible to use the same URL to access two different resources based on something else (e.g. a header).
Example: suppose we design a RESTful API for a library. Let's say the URL /books/ABC returns Winnie the Pooh for a registered user. A very BAD practice would be for an unregistered user to get a different book for the same URL. It should not matter who the user is; the same URL should lead to the same book.
Now, of course, our application may have security constraints, so for example a registered user can see the book and an unregistered one cannot. That's OK: return 404 NOT FOUND for the unregistered user. Or exclude the book from the book search. But NEVER return a different resource.
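The library example can be sketched like this. Everything here is hypothetical (the class BookLookup, the hard-coded catalog, the boolean registered flag) and it assumes Java 9 or later for Map.of; the point is only that the id either resolves to the same book or to nothing.

```java
import java.util.Map;
import java.util.Optional;

class BookLookup {
    // Hypothetical catalog: the id is the last segment of /books/{id}.
    private static final Map<String, String> BOOKS = Map.of("ABC", "Winnie the Pooh");

    // Returns the book behind the URL, or empty if the caller may not see it.
    // It never substitutes a different book for the same id.
    static Optional<String> find(String id, boolean registered) {
        if (!registered) {
            return Optional.empty();
        }
        return Optional.ofNullable(BOOKS.get(id));
    }

    // Maps the lookup result to an HTTP status code.
    static int status(String id, boolean registered) {
        return find(id, registered).isPresent() ? 200 : 404;
    }

    public static void main(String[] args) {
        System.out.println("registered: " + status("ABC", true));
        System.out.println("unregistered: " + status("ABC", false));
    }
}
```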

2. A URL should be designed with future API changes in mind - this one is kinda tricky. Suppose a registered user can add some books to a favorites list. What URL will be used?
Option 1: "/user/{userid}/favorites/" will return the favorites list of a user. Sounds reasonable. But what happens if we decide to extend this feature and allow a user to have multiple lists? What will this URL return now?
Option 2: "/user/{userid}/favorites/" will return a list of the favorites lists and "/user/{userid}/favorites/{id}" will return the specific list. Sounds reasonable. But what happens if we add a feature that allows users to share lists? Actually, here we get to the point I'll discuss in the next bullet, but it's quite clear that "/user/{userid}" should not be part of the URL, right?
Option 3: "/favorites/" will return a list of the favorites lists and "/favorites/{id}" will return the specific list. Both "/favorites/" and "/favorites/{id}" will return the lists based on the user's privileges: the system administrator will see all lists, and the user will see his own lists and the ones that were shared with him.
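Option 3 boils down to filtering by the caller's privileges behind a fixed URL. A rough sketch, assuming Java 9 or later; the class FavoritesService, the FavoriteList model and the sample data are all invented for illustration, and the user id is assumed to come from whatever authentication is in place rather than from the URL.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class FavoritesService {
    // Hypothetical model of one favorites list.
    static final class FavoriteList {
        final String id;
        final String owner;
        final Set<String> sharedWith;

        FavoriteList(String id, String owner, Set<String> sharedWith) {
            this.id = id;
            this.owner = owner;
            this.sharedWith = sharedWith;
        }
    }

    // Sample data used by the demo in main.
    static List<FavoriteList> demoLists() {
        return List.of(
                new FavoriteList("1", "alice", Set.of()),
                new FavoriteList("2", "bob", Set.of()),
                new FavoriteList("3", "bob", Set.of("alice")));
    }

    // What GET /favorites/ would list for this caller: everything for an admin,
    // otherwise the caller's own lists plus the lists shared with the caller.
    static List<String> visibleListIds(List<FavoriteList> all, String userId, boolean admin) {
        return all.stream()
                .filter(l -> admin || l.owner.equals(userId) || l.sharedWith.contains(userId))
                .map(l -> l.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<FavoriteList> all = demoLists();
        System.out.println(visibleListIds(all, "alice", false)); // [1, 3]
        System.out.println(visibleListIds(all, "bob", false));   // [2, 3]
        System.out.println(visibleListIds(all, "root", true));   // [1, 2, 3]
    }
}
```

The URL stays the same for every caller; only the set of lists visible through it changes.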

3. Security is not part of the resource identification and therefore should not be part of the URL - in the previous bullet I already described why adding a userid to the URL is problematic. Let's talk about it a little more. First, you cannot rely on the "userid" present in the URL to identify the user. You need a separate form of authentication (unless you use username+password authentication and also put the password in the URL; it's possible, but it's really a VERY BAD practice). Furthermore, a good API should not contain the definition of an authentication method at all. It should be possible to change the authentication method without changing the API (for example, it has been quite common in the last few years to move from username+password to OAuth authentication). Basically, the API should always expect to receive the userid, but not as an integral part of the API! With HTTP it's quite easy to put the security-related stuff in the headers and keep the URL clean.
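From the client side this could look roughly as follows. It is a sketch assuming Java 11 or later and an OAuth-style bearer token; the class AuthHeaderExample, the example.com URL and the token value are placeholders, and the request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class AuthHeaderExample {
    // The caller's identity travels in the Authorization header,
    // so the URL contains only the resource path.
    static HttpRequest favorites(String baseUrl, String token) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/favorites/"))
                .header("Authorization", "Bearer " + token)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = favorites("http://example.com", "placeholder-token");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().firstValue("Authorization").orElse("(none)"));
    }
}
```

Switching to another authentication scheme would change only the header line; the URL and the rest of the API stay as they are.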

4. If you must put additional metadata in the URL, put it in a query parameter - in general, all the metadata you have (like security, content-type, accept content-type) should be in headers. However, sometimes, for technical reasons, it becomes impossible to put it in a header, so you must use the URL. That's OK, but put it after the question mark, in the query parameters part. That way it becomes an "optional" part and can easily be changed or removed later if it is no longer needed.
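As a small illustration, here is one way to attach such metadata without touching the resource path. The parameter name "format" and the class QueryMetadataExample are hypothetical, and the sketch assumes Java 10 or later for URLEncoder.encode(String, Charset).

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class QueryMetadataExample {
    // Appends the desired representation as a query parameter.
    // The path, which identifies the resource, is left unchanged.
    static URI withFormat(String resourceUrl, String format) {
        String encoded = URLEncoder.encode(format, StandardCharsets.UTF_8);
        return URI.create(resourceUrl + "?format=" + encoded);
    }

    public static void main(String[] args) {
        URI uri = withFormat("http://example.com/favorites/42", "json");
        System.out.println(uri);
    }
}
```

If the parameter is later replaced by a proper Accept header, the resource URL itself does not have to change.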

Tuesday, April 10, 2012

Why REST? Or "Simple vs. Easy"

This is a post that I have wanted to write for years, but I could not find the right words. I'm fully aware that hundreds of blog posts have already fully covered this subject, and still I want to write my opinion in my blog. It is my blog after all, right?

So "Why REST?"
But first, I'd like to say two words. One is "simple" and the second one is "easy". Although these words sound like synonyms, actually they are not. Some simple things can be not easy. And some easy things can be not simple.
For example: breathing is easy for most people, you don't need to think about it. But is it simple? Can you make someone breathe, if he doesn't?
On the contrary, cleaning an apartment is simple. But is it easy? Not for me at least...

The key point here is that simple things just work. They may not be easy to achieve, but once you do it, they work. Once you clean an apartment, you have a clean apartment. For a while at least.
But when something is not simple, it's quite problematic even to understand how it should work. Sometimes it's easy to start (breathing, for example), but what happens if something goes wrong (the breathing stops, for example)?

And now back to REST.
REST is simple. Not easy, but simple. Correct RESTful APIs are very clear and very simple to understand. They may not be easy to implement, but once implemented they work. It's quite easy to troubleshoot RESTful APIs; you can do it with a simple HTTP proxy (e.g. Fiddler). REST also ensures a good degree of decoupling between server and client.
Yes, it takes time to develop a RESTful API. Yes, there is not much automation around, and suddenly a developer needs to write a layer that would have been generated for him with SOAP.

And now to SOAP (and other APIs that are generated from WSDLs/custom XMLs/whatever): this one is easy. Usually I can get a working API very fast. There are a lot of tools that help you. BUT it is not simple: the tools generate a lot of code that you don't know and that is sometimes not very readable. Integrations become very complex (did you ever install an ESB?). Versioning becomes a nightmare. And the standards don't work very well (did you ever try to integrate Java with .NET using some complex types?).

So the bottom line is: REST is simple, but not easy. SOAP is easy at the beginning, but not simple.
Simple is good, especially if you continue keeping it simple.
And forget about easy. Nobody said that software development should be easy.

Wednesday, April 4, 2012

Changing Putty Default Settings

It's trivial, and I should have figured it out myself, but I suffered for many years until I finally googled for it, and voila: to save a default setting in PuTTY, open PuTTY, change the setting, choose "Default Settings" under "Saved Sessions" and click "Save".