Monday, May 7, 2012

REST Best Practices: Use HTTP Status Codes

When implementing a RESTful service, keep in mind that HTTP already provides you with the ability to send a status code as part of the protocol. Do NOT put the error code inside the message body itself!

HTTP defines five types of status codes:
* 1xx - Informational
* 2xx - Successful
* 3xx - Redirection
* 4xx - Client Errors
* 5xx - Server Errors

See the full reference of HTTP/1.1 Status Code Definitions.

I'm not going to describe all the HTTP status codes here, but I'll give some basic tips.

1. In case of a successful request, always return a status code from the 2xx group. It's highly recommended to use not only the basic 200 code, but also the more specific ones. For example, 201 means a new resource was created and the Location header contains its path. 204 means the response contains no content, so the client can optimize its code and not even try to read the response body (saving some redundant object creation).
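As a rough sketch of this idea (the ApiResponse class and its factory methods are my own illustration here, not part of any framework), a success response could carry the right code and headers like this:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal response model, purely illustrative: real frameworks
// (Servlets, JAX-RS) provide their own response types.
class ApiResponse {
    final int status;
    final Map<String, String> headers = new HashMap<>();
    final String body;

    ApiResponse(int status, String body) {
        this.status = status;
        this.body = body;
    }

    // 201 Created: the Location header points at the new resource.
    static ApiResponse created(String location) {
        ApiResponse r = new ApiResponse(201, null);
        r.headers.put("Location", location);
        return r;
    }

    // 204 No Content: the client can skip reading the body entirely.
    static ApiResponse noContent() {
        return new ApiResponse(204, null);
    }

    // Plain 200 OK with a body for ordinary successful reads.
    static ApiResponse ok(String body) {
        return new ApiResponse(200, body);
    }
}
```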

2. In case of a failed request, clearly distinguish between client and server errors. A client error means the client has sent a bad request: it may be incorrectly formatted, unauthorized, use a method that is not allowed (e.g. the server accepts GET requests while a POST has been sent), and so on.
In case of a client error, it's best to detect it as early as possible in the server's code, to reduce the amount of logic that runs before the request is rejected. For example, it's redundant to parse the request body if the security header is missing.
2.1 Send a clear response to the client. It can be a good practice to supply text or HTML content inside the message body telling what was wrong with the request.
2.2 Don't log requests with a client error above the info level. After all, it's the client's error, not the server's. The only reason to log client errors at all is to help clients by taking a look at the server logs. Make sure not to create a misleading picture of a log full of exceptions when there is no problem on the server side at all.
2.3 By receiving a 4xx code, the client must understand that it did something wrong and should correct its request.
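Here is a minimal sketch of the "reject as early as possible" idea (the header name and validation order are illustrative assumptions): check the cheap things, like the security header, before doing any expensive work such as parsing the body:

```java
import java.util.Map;

class RequestValidator {
    // Returns an HTTP status code: 0 means "request is acceptable,
    // continue processing". The Authorization header is an
    // illustrative choice of "security header".
    static int validate(Map<String, String> headers, String body) {
        // Cheapest check first: reject before parsing anything.
        if (!headers.containsKey("Authorization")) {
            return 401; // Unauthorized: no point reading the body
        }
        if (body == null || body.isEmpty()) {
            return 400; // Bad Request: tell the client what was wrong
        }
        return 0; // looks fine so far, proceed to real parsing
    }
}
```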

3. In case of a server error, return a 5xx error code. Usually the best choice is to simply return 500. Do NOT put the full stacktrace into the response body. Actually, put nothing in the response body. Why should a client care about the reason for the server's failure? Is the database down? Is it a code problem? Whatever it is, it's not the client's business.
3.1 Sometimes it can be nice to return the 501 Not Implemented code. Usually this happens if you agreed on some functionality and created a prototype for it, but haven't implemented it yet.
3.2 By receiving a 5xx code, the client must understand that something went wrong and that it should retry the same request. Whether it makes sense to retry immediately, in 5 minutes or in 5 days depends on the client. The server can return the *Retry-After* header, and the client should respect it.
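A client-side sketch of respecting *Retry-After* might look like this (the helper below is my own; per HTTP, the header value is either delta-seconds or an HTTP-date):

```java
import java.time.Duration;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

class RetryAfter {
    // Retry-After may be delta-seconds ("120") or an HTTP-date.
    // Returns the wait as a Duration relative to "now" (passed in
    // to keep the function testable). Falls back to a default when
    // the header is missing or unparsable.
    static Duration parse(String headerValue, ZonedDateTime now, Duration fallback) {
        if (headerValue == null) {
            return fallback;
        }
        try {
            return Duration.ofSeconds(Long.parseLong(headerValue.trim()));
        } catch (NumberFormatException ignored) {
            // not delta-seconds, try HTTP-date below
        }
        try {
            ZonedDateTime at = ZonedDateTime.parse(headerValue.trim(),
                    DateTimeFormatter.RFC_1123_DATE_TIME);
            Duration d = Duration.between(now, at);
            return d.isNegative() ? Duration.ZERO : d;
        } catch (DateTimeParseException e) {
            return fallback;
        }
    }
}
```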

There are more status codes that can and should be used with RESTful APIs. For example, 304 Not Modified allows you to save traffic and skip the response body if the resource was not modified. Many status codes are implemented by frameworks and intermediaries; for example, JAX-RS frameworks will automatically return some client errors, like 405 and 415.
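As an illustration of how 304 works under the hood (frameworks and intermediaries often do this for you; the helper below is just a sketch), the server compares the client's If-None-Match header against the resource's current ETag:

```java
class ConditionalGet {
    // Sketch of conditional GET handling: if the client already has
    // the current version of the resource, answer 304 with no body.
    static int respond(String ifNoneMatch, String currentEtag) {
        if (currentEtag.equals(ifNoneMatch)) {
            return 304; // Not Modified: skip the body, save traffic
        }
        return 200; // send the full representation
    }
}
```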

Summary

1. In case of a successful request, always return a status code from the 2xx group.
2. In case of an unsuccessful request, never return 2xx with the status embedded in the message body. Return the appropriate 4xx or 5xx status code.
3. For client errors, return 4xx. Don't log these requests above the info level. These are the client's problems, not the server's.
4. For server errors, return 5xx. Log the error, but don't send it to the client.

Thursday, May 3, 2012

Promoting tarlog-plugins

In the four years since I first published the tarlog-plugins Eclipse plugin, it has had 4600+ downloads.
I know that many readers of this blog are actually using it.
I won't ask why none of you has ever clicked the Donate button. After all, I have never done it myself...
But why won't you share it?
You can star and/or +1 it on the project's page.
You can favorite it on the marketplace.
You can... well, the previous two are enough. But if you really want to Donate, there is a Donate button on the project's page.

P.S. And there is also an Encoder Tool. It didn't become as popular as tarlog-plugins, but it's still nice for small encoding/decoding tasks.

Monday, April 16, 2012

REST Best Practices: Using HTTP Verbs

A common misconception about RESTful APIs is that they MUST be CRUD-like: Create maps to POST, Read maps to GET, Update maps to PUT, Delete maps to DELETE.

In fact this is incorrect. They MAY be CRUD-like, or they MAY be something else. The rule is: each resource MUST have a set of predefined operations. But not necessarily CRUD, and not necessarily GET-POST-PUT-DELETE. In fact, did you know that HTTP/1.1 defines 8 verbs: HEAD, GET, POST, PUT, DELETE, OPTIONS, TRACE and CONNECT? And that in addition you may define your own HTTP verbs that behave as you wish? For example, WebDAV defines the following verbs in addition to the standard ones: PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, UNLOCK.

So now that you know you don't have to stick to the standard HTTP verbs, I still advise you to do so. Why? It's relatively clear what action these methods perform on a resource. It may be much less clear what your own verb does.

So let's assume that 90% of RESTful APIs will use only the standard HTTP verbs. What's important to pay attention to? The most important rule is to use the HTTP verbs correctly. The most common example of an incorrect use of the GET verb can be taken from HTML forms: in HTML, both GET and POST can be used to submit data from a form to a server. But GET is actually a safe method, meaning it SHOULD NOT have the significance of taking an action other than retrieval.

So remember, Rule 1: HEAD and GET are safe methods.

Now about the idempotent methods. Methods are idempotent if the side effects of N > 0 identical requests are the same as for a single request. The idempotent methods are GET, HEAD, PUT and DELETE.
Some examples of what this means:
1. PUT can be used to create a resource. Yes, I know that best practice says it is POST that creates resources, and in a second I'll explain why. So let's say PUT creates a resource: if the request is duplicated for some reason, the second request must not create a new resource, and must return exactly the same response as the first request. It's important to understand that various HTTP intermediaries know that PUT is idempotent and may, for example, apply some caching. So POST is actually much safer, especially for creating resources. But pay attention: POST may also be safely used for updates and even for fetching a resource! Nobody said it is only for creation.
2. When DELETE-ing a resource, pay attention that a second DELETE request should return the same response. So if the first response returns "200 OK", the second (third, fourth, etc.) DELETE request to the same resource should also return "200 OK". Note that once the resource is deleted, a GET request to it should return "404 NOT FOUND", but DELETE should continue returning "200 OK".

So Rule 2 will be: use the idempotent methods correctly. If you cannot ensure that you use them correctly, use other methods; POST can be a good choice.
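To make the DELETE example concrete, here is a toy in-memory store (my own sketch, not a real server) that follows the rules above: repeated DELETEs return the same response, while a GET after the delete sees 404:

```java
import java.util.HashMap;
import java.util.Map;

// In-memory sketch of the DELETE semantics described above.
// The key point: repeating a DELETE yields the same status code,
// while GET observes the deletion.
class BookStore {
    private final Map<String, String> books = new HashMap<>();

    void put(String id, String content) {
        books.put(id, content);
    }

    int get(String id) {
        return books.containsKey(id) ? 200 : 404;
    }

    int delete(String id) {
        books.remove(id);
        // Idempotent: return 200 whether or not the resource
        // still existed when this particular request arrived.
        return 200;
    }
}
```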

Now a little about the methods that you are unlikely to implement yourself. The behavior of HEAD, TRACE, CONNECT and OPTIONS is well defined, and they are usually implemented by the infrastructure (CONNECT and TRACE by the web servers; HEAD and OPTIONS by the frameworks, like Servlets or JAX-RS). If, for some reason, you decide to implement one of these methods, make sure you do it correctly.

Summary

1. Don't try to map CRUD to HTTP verbs. There is no need to do it.
2. Create new HTTP verbs, if needed.
3. Use the existing HTTP verbs correctly; pay special attention to the safe methods (HEAD and GET) and to the idempotent methods (HEAD, GET, PUT and DELETE).
4. If you are not sure which HTTP verb to choose, use POST.

Wednesday, April 11, 2012

REST Best Practices: Use JSON

Something that RESTful design does not care about is the actual representation. The concept says that a resource can have one representation or more; it does not affect the resource itself. JAX-RS solved this very beautifully with a clear separation between Resources and Providers.

And then real life happens, and you need to implement a resource and represent it somehow. The natural choice for programmers who come to the RESTful world from SOAP is XML.
There are several reasons for it, the main one being: they are used to XML, so why change something that works? Or at least worked?
Wrong! You don't need to stick to something you are used to. You can use something better. And JSON is better. Why? The main reason: it's simpler - it has no namespaces and no attributes. There is less to write, and fewer places to make a mistake.
Compatibility issues are solved better with JSON, since you don't have XSDs and you cannot fail validation. New fields will be silently ignored by the old version, and that's all.
You can always create XML out of JSON in case your client prefers XML. But it's not so easy to create JSON out of XML (yes, I've heard of Badgerfish, but if you start your design from JSON and then add XML, you don't need Badgerfish).
JSON is also less verbose, which saves you traffic, but really that's not the main reason to use it. The main reason is SIMPLICITY!

Now a little implementation note for those of you who use Java: create the classes that you want to be sent/received by the resources as POJOs. Keep them simple and prefer String as the major data type. If you need something complex, like Date, do the formatting yourself and describe it in your documentation (SimpleDateFormat will help you). Once you have a set of simple POJOs, use Jackson to create/parse JSON and JAXB (integrated in Java 6+) to create/parse XML. Jackson comes with JAX-RS Providers; JAXB Providers are already included by the JAX-RS implementations. Thus in one shot you'll get both JSON and XML representations for your resources.
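For example, a sketch of such a POJO (the field names and the date pattern are illustrative choices of mine; pick your own and document it):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// A simple POJO as described above: String fields only, so Jackson
// (for JSON) and JAXB (for XML) can both map it without custom
// converters. Dates are kept as formatted Strings.
class Book {
    public String id;
    public String title;
    public String publishedDate; // formatted date kept as a String

    private static final String DATE_PATTERN = "yyyy-MM-dd'T'HH:mm:ss'Z'";

    static String formatDate(Date date) {
        SimpleDateFormat f = new SimpleDateFormat(DATE_PATTERN);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f.format(date);
    }

    static Date parseDate(String text) throws ParseException {
        SimpleDateFormat f = new SimpleDateFormat(DATE_PATTERN);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f.parse(text);
    }
}
```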

Back to the representations. What if you cannot use JSON? Of course such a thing can happen. For example, you may want to upload an image, and there is no need to try to fit it into JSON. That's fine. Use whatever representation you want, but AVOID CUSTOM REPRESENTATIONS! If you feel that you must invent some new serialization mechanism that was not invented before, think twice. Are you really so unique? Really? Remember, you may have different clients. Each of them will have to learn your serialization mechanism; will it be simple for them? Or will you need to create a custom library? Remember: there are a lot of different languages, technologies and platforms, and you may need to support all of them.

Summary

1. Design your APIs to send/receive JSON.
2. If you later need XML, it can be easily added without writing a single line of code (at least with JAX-RS).
3. Need to send/receive something binary? Fine. But do your best to use standard formats and don't reinvent the wheel (unless you feel that it's a must).
4. Never use the standard Java serialization mechanism. If you need to serialize a Java class, read #1 again and use JSON ;)

REST Best Practices: Create a good URL

REST is not a standard, therefore you are free to choose how to use it. It's very hard to say whether you are doing something "100% right" or "100% wrong". And still, there are good RESTful APIs and bad RESTful APIs. Probably the most important part of your RESTful API is the URLs. They identify your resources. Well-designed URLs make your API look good.

And here come some practices that I personally believe are best (or just good):

1. A URL must uniquely identify the resource - it should be impossible for the same URL to access two different resources based on something else (e.g. a header).
Example: suppose we design a RESTful API for a library. Let's say the URL /books/ABC returns Winnie the Pooh for a registered user. A very BAD practice would be if an unregistered user got a different book for the same URL. It should not matter who the user is; the same URL should lead to the same book.
Now, of course our application may have security implications, so, for example, a registered user can see the book and an unregistered one cannot. That's fine: return 404 NOT FOUND to the unregistered user, or eliminate the book from the book search. But NEVER return a different resource.

2. URLs should be designed for further API changes - this one is kinda tricky. Suppose a registered user can add some books to a favorites list. What URL will be used?
Option 1: "/user/{userid}/favorites/" will return the favorites list of a user. Sounds reasonable. But what happens if we decide to extend this feature and allow a user to have multiple lists? What will this URL return then?
Option 2: "/user/{userid}/favorites/" will return a list of the favorites lists, and "/user/{userid}/favorites/{id}" will return a specific list. Sounds reasonable. But what happens if we add a feature that allows users to share lists? Actually, here we get to the point I'll discuss in the next bullet, but it's already quite clear that "/user/{userid}" should not be part of the URL, right?
Option 3: "/favorites/" will return a list of the favorites lists, and "/favorites/{id}" will return a specific list. Both resources will return lists based on the user's privileges: the system administrator will see all lists, while a user will see his own lists and the ones that were shared with him.
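A small sketch of the Option 3 URL scheme (JAX-RS would express this with @Path annotations; this standalone routine of mine just shows the shape of the matching):

```java
class FavoritesRouter {
    // Returns "collection" for /favorites/, "list:{id}" for
    // /favorites/{id}, or null for paths this resource does not own.
    static String route(String path) {
        if (path.equals("/favorites") || path.equals("/favorites/")) {
            return "collection"; // all lists the caller may see
        }
        if (path.startsWith("/favorites/")) {
            String id = path.substring("/favorites/".length());
            if (!id.isEmpty() && !id.contains("/")) {
                return "list:" + id; // one specific list
            }
        }
        return null; // some other resource
    }
}
```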

3. Security is not part of the resource identification and therefore should not be part of the URL - in the previous bullet I already described why adding a userid to the URL is problematic. Let's talk about it a little more. First, you cannot rely on the "userid" present in the URL to identify the user; you need a different form of authentication (unless you use username+password authentication and put the password in the URL too - possible, but a really VERY BAD practice). Furthermore, a good API should not contain the definition of an authentication method at all. It should be possible to change the authentication method without changing the API (for example, it has been quite common in the last few years to move from username+password to OAuth authentication). Basically, the API should always expect to receive the userid, but not as an integral part of the API itself! With HTTP it's quite easy to put the security-related stuff in the headers and keep the URL clean.

4. If you must put additional metadata in the URL, put it in a query parameter - in general, all the metadata you have (like security, content type, accepted content type) should be in headers. However, sometimes for technical reasons it becomes impossible to put it in a header, so you must use the URL. That's OK, but put it after the question mark, in the query parameters part. Thus it becomes an "optional" part and can easily be changed or removed later if not needed.

Tuesday, April 10, 2012

Why REST? Or "Simple vs. Easy"

This is a post that I have wanted to write for years without finding the correct words. I'm fully aware that hundreds of blog posts have already fully covered this subject, and still I want to write my opinion in my blog. It is my blog, after all, right?

So "Why REST?"
But first, I'd like to say two words. One is "simple" and the other is "easy". Although these words sound like synonyms, they actually are not. Some simple things are not easy. And some easy things are not simple.
For example: breathing is easy for most people; you don't need to think about it. But is it simple? Can you make someone breathe if he doesn't?
On the contrary, cleaning an apartment is simple. But is it easy? Not for me, at least...

The key point here is that simple things just work. They may not be easy to achieve, but once you do it, they work. Once you clean an apartment, you have a clean apartment. For a while, at least.
But when something is not simple, it's quite problematic even to understand how it should work. Sometimes it's easy to start (breathing, for example), but what happens if something goes wrong (it stops, for example)?

And now back to REST.
REST is simple. Not easy, but simple. Correct RESTful APIs are very clear and very simple to understand. They may not be easy to implement, but once implemented they work. It's quite easy to troubleshoot RESTful APIs; you can do it with a simple HTTP proxy (e.g. Fiddler). And REST ensures quite a decoupling between server and client.
Yes, it takes time to develop a RESTful API. Yes, there is not much automation around, and suddenly a developer needs to write a layer that SOAP would have generated for him.

And now to SOAP (and other APIs that are generated from WSDLs/custom XMLs/whatever): this one is easy. Usually I can get some working API very fast. There are a lot of tools that help you. BUT, it is not simple: the tools generate a lot of code that you don't know and which is sometimes not very readable. The integrations become very complex (have you ever installed an ESB?). The versioning becomes a nightmare. And the standards don't work very well (have you ever tried to integrate Java with .NET using some complex types?).

So the bottom line is: REST is simple, but not easy. SOAP is easy at the beginning, but not simple.
Simple is good, especially if you keep it simple.
And forget about easy. Nobody said that software development should be easy.

Wednesday, April 4, 2012

Changing Putty Default Settings

It's trivial, and I should have figured it out myself, but I suffered for many years until I finally googled it, and voila: to save a default setting in PuTTY, open PuTTY, change the setting, choose "Default Settings" under "Saved Sessions" and click "Save".

Thursday, December 8, 2011

Integration Testing of RESTful Application

In the previous posts I described how to start Jetty from code at the beginning of the unit tests, and how to initialize the in-memory HSQL database once Jetty is started.

If you have completed the steps from these posts, you should have a running application at the beginning of your tests. Now you are ready to start actually testing the application.

I believe that the best way to test a RESTful API is to issue actual HTTP requests, and since we have a Jetty server running, this becomes possible.

There are a lot of HTTP clients available in Java. The examples below use the Apache HTTP Client.
But as always, first let's add the Maven dependency:

<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <scope>test</scope>
</dependency>

Now some convenient static methods that can be used:

import java.io.IOException;

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpDelete;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.methods.HttpPut;
import org.apache.http.entity.ByteArrayEntity;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.AbstractHttpMessage;
import org.apache.http.message.BasicHeader;
...
    static HttpResponse executeGet(String url) throws IOException {
        HttpClient httpclient = new DefaultHttpClient();
        HttpGet get = new HttpGet(url);
        return httpclient.execute(get);
    }

    static HttpResponse executeDelete(String url) throws IOException {
        HttpClient httpclient = new DefaultHttpClient();
        HttpDelete delete = new HttpDelete(url);
        return httpclient.execute(delete);
    }

    static HttpResponse executePost(String url, byte[] body) throws IOException {
        HttpClient httpclient = new DefaultHttpClient();
        HttpPost post = new HttpPost(url);
        post.setEntity(new ByteArrayEntity(body));
        return httpclient.execute(post);
    }

    static HttpResponse executePut(String url, byte[] body) throws IOException {
        HttpClient httpclient = new DefaultHttpClient();
        HttpPut put = new HttpPut(url);
        put.setEntity(new ByteArrayEntity(body));
        return httpclient.execute(put);
    }

So your test can be something like:

    @Test
    public void testGet() throws Exception {
        HttpResponse response = executeGet(url);
        assertEquals(response.getStatusLine().getStatusCode(), 200);
    }


Recommended Reading

1. Next Generation Java Testing: TestNG and Advanced Concepts
2. Apache Maven 3 Cookbook
3. Spring Recipes: A Problem-Solution Approach

Automated Integration Tests with Jetty, Maven and Other Neat Frameworks - 2

In the previous post, I started a Jetty server at the beginning of a unit test with a configured external data source.

Let's talk about it a little bit more. The assumption here is that the application normally uses an external data source that is accessed via JNDI. In general, it's a good practice to keep a data source external to the application:
1. It's always possible to change the data source without touching the application - let's say a bug was found in the data source you are using, or it should be configured differently. If the data source is embedded in the application and such a change is required, you will probably need to release a patch. If the data source is external, it is enough to just change/reconfigure it.
2. In some deployments, several applications can use the same data source. Consider a Tomcat running several wars: quite a common case, right? If the data source is embedded, each war has its own data source. Usually this is a waste of resources and a duplication of configuration.
3. And finally, it is possible to change the data source for unit tests, which is exactly our use case.

For unit tests it's really convenient to use an in-memory database. There are a few databases implemented in Java that provide this capability; my favorites are H2 and HSQL. In this example I have used HSQL, so here comes jetty-ds-test.xml:

<Configure id="Server" class="org.eclipse.jetty.server.Server">
    <New id="DSTest" class="org.eclipse.jetty.plus.jndi.Resource">
        <Arg></Arg>
        <Arg>jdbc/my_ds</Arg>
        <Arg>
            <New class="org.hsqldb.jdbc.JDBCDataSource">
                <Set name="Url">jdbc:hsqldb:mem:test;sql.syntax_ora=true</Set>
                <Set name="User">sa</Set>
                <Set name="Password">sa</Set>
            </New>
        </Arg>
    </New>
</Configure>

This configures HSQL in memory. Notice sql.syntax_ora=true, which makes HSQL use Oracle syntax. This is useful if you use Oracle in production but want HSQL for unit testing or development.

So now, at the beginning of a unit test, you have a Jetty server running with your application connected to an in-memory HSQL database via JNDI. But something is still missing: a database schema. To remind you, we have just started a new database in memory and it's empty. You probably already have a script that creates the schema, tables, indexes and maybe some data. Now you need to run it.

Actually, there are several options for doing this. One of the easiest is to use Spring's SimpleJdbcTestUtils. But first we'll need to add a dependency on spring-test in our pom.xml:

<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-test</artifactId>
    <scope>test</scope>
    <version>${spring-version}</version>
</dependency>

And here comes the code that runs the sql script:
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.Resource;
import org.springframework.jdbc.core.simple.SimpleJdbcTemplate;
import org.springframework.test.jdbc.SimpleJdbcTestUtils;
...
    @BeforeClass(dependsOnMethods = { "startJetty" })
    public void initiateDatabase() throws Exception {       
        InitialContext initialContext = new InitialContext();
        DataSource ds = (DataSource) initialContext.lookup("jdbc/my_ds");
        SimpleJdbcTemplate simpleJdbcTemplate = new SimpleJdbcTemplate(ds);
        Resource resource = new ClassPathResource(sqlScriptFileName);
        SimpleJdbcTestUtils.executeSqlScript(simpleJdbcTemplate, resource, false);
    }
In this snippet I load sqlScriptFileName from the classpath. Usually it's convenient to place the script in src/test/resources, but if you don't like that, you can always load it from a different place by using other Resource implementations (e.g. UrlResource is quite convenient).

As I have already said, in this snippet I used Spring. If you are familiar with Spring, it probably feels natural to you. If you aren't - don't be afraid: only the unit tests depend on Spring, not the actual application.

And now you are ready to start the testing.



Automated Integration Tests with Jetty, Maven and Other Neat Frameworks

Let's say you have a web application (aka a war) that exposes a RESTful API and connects to a database. Now you want some automated tests (and I'm not going to describe here why you actually must have automated tests that run regularly against your API).

One of the best solutions to do it, in my opinion, is running your application on a Jetty server that is embedded in your unit test with some in-memory database.

But let's do it step-by-step:

Step 1: Start Jetty from a Unit Test

(In this guide I have used TestNG, but I see no reason why the same functionality cannot be achieved in JUnit)

So first we need to start Jetty in our test. But before we actually do that, we need to make sure our Maven project contains the relevant dependencies. First, Jetty:
<dependency>
    <groupId>org.eclipse.jetty</groupId>
    <artifactId>jetty-server</artifactId>
    <version>${jetty-version}</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.eclipse.jetty</groupId>
    <artifactId>jetty-webapp</artifactId>
    <version>${jetty-version}</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.eclipse.jetty</groupId>
    <artifactId>jetty-jndi</artifactId>
    <version>${jetty-version}</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.eclipse.jetty</groupId>
    <artifactId>jetty-plus</artifactId>
    <version>${jetty-version}</version>
    <scope>test</scope>
</dependency>

The Jetty dependencies in this guide already contain JNDI support (jetty-jndi and jetty-plus), which will be needed later. If JNDI support is not required, they can be omitted.

And then let's start it before the tests start:

import java.io.FileNotFoundException;
import java.io.InputStream;

import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.webapp.WebAppContext;
import org.eclipse.jetty.xml.XmlConfiguration;
import org.testng.annotations.BeforeClass;

public class MyTest {

    private static final String RESOURCES_URL = "/rs";
    private static final String CONTEXT       = "/app_context";
    private static final String DS_CONFIG     = "/jetty-ds-test.xml";
    private String              baseResourceUrl;

    @BeforeClass
    public void startJetty() throws Exception {
        Server server = new Server(0);   // see notice 1
        server.setHandler(new WebAppContext("src/main/webapp", CONTEXT)); // see notice 2

        // see notice 3
        InputStream jettyConfFile = MyTest.class.getResourceAsStream(DS_CONFIG);
        if (jettyConfFile == null) {
            throw new FileNotFoundException(DS_CONFIG);
        }
        XmlConfiguration config = new XmlConfiguration(jettyConfFile);
        config.configure(server);

        server.start();
        
        // see notice 1
        int actualPort = server.getConnectors()[0].getLocalPort();
        baseResourceUrl = "http://localhost:" + actualPort + CONTEXT + RESOURCES_URL;
    }
Please notice that:
1. Jetty is started on a random port. The actual URL with the actual port is saved in baseResourceUrl to be used later by the tests.
2. The web application context points to Maven's src/main/webapp.
3. Jetty is started with a data source configuration. (See Running Jetty from Maven using a JNDI Data Source)

Part 2

