tag:blogger.com,1999:blog-52395354914823527312024-02-19T17:54:44.662+02:00Tarlog on JavaMy blog on Java and other technical issues.Unknownnoreply@blogger.comBlogger110125tag:blogger.com,1999:blog-5239535491482352731.post-51393886906881947992018-10-13T19:34:00.000+03:002018-10-13T19:34:13.432+03:00Setting environment variable from code in JavaNormally, environment variables are set, as the name implies, by the environment. However, sometimes code runs as part of an environment that you cannot control.<br />
This often happens in integration or performance tests.<br />
<br />
And here's the way to overcome it:<br />
<br />
<script src="https://gist.github.com/tarlog/7cd2b67d65c53693573a79b5b10a2444.js"></script><br />
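For the common case where you do control how a process is started (e.g. a test harness that forks the process under test), the supported route is ProcessBuilder's environment map; a minimal sketch (the variable name is made up):

```java
import java.io.IOException;

public class EnvDemo {
    public static void main(String[] args) throws IOException, InterruptedException {
        // a child process inherits the parent's environment plus whatever we add here
        ProcessBuilder pb = new ProcessBuilder("printenv", "MY_FLAG");
        pb.environment().put("MY_FLAG", "enabled");
        pb.inheritIO(); // forward the child's stdout: prints "enabled"
        int exitCode = pb.start().waitFor();
        System.out.println("exit code: " + exitCode);
    }
}
```

The reflection trick in the gist is only needed when the launch is out of your hands.<br />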
<br />
Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-5239535491482352731.post-70158444654039257592017-11-15T20:29:00.001+02:002017-11-15T20:29:19.029+02:00Why you should be careful when using Lombok or other code generation toolsLook at the following code:<br />
<br />
<script src="https://gist.github.com/tarlog/71919349dc32bd23bfba2e5dc4108f93.js"></script><br />
<br />
Will the testHashCode succeed or fail?<br />
<br />
The answer is that it will fail: the second assert, <b>assertTrue(set.contains(myEntity))</b>, will not find myEntity in the HashSet.<br />
But it's there, right? It was never removed.<br />
<br />
So what we have here is: 1. A business problem, since it's impossible to get an object from a set even though it's there. 2. A memory leak, since the object cannot be removed either.<br />
<br />
But how did it happen?<br />
<br />
The problem is with Lombok's @Data annotation. It's a very convenient annotation that auto-generates the Java utility methods equals, hashCode and toString, as well as the relevant constructors, getters and setters.<br />
Yes, it's very convenient, but a hidden problem is introduced: equals and hashCode include all fields, and when the value of a field changes, hashCode returns a different value. Therefore, the object cannot be found in the HashSet anymore.<br />
<br />
This problem is not unique to Lombok. Exactly the same problem will occur if you write the methods yourself using mutable fields, or if you use any other code-generation or reflection tool.<br />
However, if you write the code yourself, the problem is a bit more visible, while with Lombok it's kind of voodoo. <br />
<br />
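The same trap is easy to reproduce without Lombok; a minimal sketch with a hand-written equals/hashCode over a mutable field (class and field names invented):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class MutableKeyDemo {
    static final class Entity {
        String name; // mutable field included in equals/hashCode

        Entity(String name) { this.name = name; }

        @Override public boolean equals(Object o) {
            return o instanceof Entity && Objects.equals(name, ((Entity) o).name);
        }
        @Override public int hashCode() { return Objects.hash(name); }
    }

    public static void main(String[] args) {
        Set<Entity> set = new HashSet<>();
        Entity e = new Entity("a");
        set.add(e);
        e.name = "b";                        // mutate after insertion
        System.out.println(set.contains(e)); // false: lookup hashes to a different bucket
        System.out.println(set.size());      // 1: still in the set, but unreachable
    }
}
```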
The best practices here are not specific to Lombok or any other library, and they are quite simple: <br />
1. As much as possible, try to make your class immutable.<br />
2. Even if a class is mutable, are all its fields mutable? Use only immutable fields in hashCode and equals and you will be safe.<br />
3. If a class is completely mutable and you cannot rely on some immutable fields, reconsider whether you need to override hashCode and equals at all. Is the default implementation sufficient? <br />
4. If none of the above works for you - document. Put a HUGE WARNING in the javadoc explaining why the users of the class must be careful if they decide to store instances in a HashSet or as keys in a HashMap.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-45540048317741013832017-10-20T20:52:00.002+03:002017-10-20T20:52:45.003+03:00Running arbitrary command when using auto-completion in ZshDid you ever want to run a predefined command as auto-completion in zsh?<br />
For example, you may have a script that accepts only predefined values as input. <br />
Actually it's possible and quite easy.<br />
<br />
First, you will need a file that will be invoked to auto-complete the command.<br />
Here's an example:<br />
<script src="https://gist.github.com/tarlog/6490d8d26876b9fd521d5f8a8ac75d4b.js"></script><br />
You can replace HERE_COMES_SHELL_COMMAND with any command. For example, it can be "cat ~/myfile" to read the options from a file.<br />
#compdef defines the list of commands this file will auto-complete. In the above example it's mycommand; change it to your actual command.<br />
As already mentioned, it can also be a list: <b>#compdef mycommand myscript myprogram</b> will auto-complete any of mycommand, myscript and myprogram.<br />
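For reference, the completion file might look roughly like this (a sketch only; `_describe` is the zsh helper that turns each line of the command's output into a completion candidate):

```zsh
#compdef mycommand
# split the command's output into one completion candidate per line
local -a options
options=( ${(f)"$(HERE_COMES_SHELL_COMMAND)"} )
_describe 'options' options
```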
<br />
Now, you need to tell zsh about your file:<br />
1. Place your file in some directory. For example: ~/.myautocomplete<br />
2. In your ~/.zshrc add the following line: fpath=(~/.myautocomplete $fpath)<br />
3. After this line add the following lines:<br />
autoload -U compinit<br />
compinit<br />
<br />
On macOS I used the following instead:<br />
autoload -U compaudit compinit<br />
<br />
4. You may need to delete files that start with .zcompdump in your home directory.<br />
5. Start a new shell, and it should work.<br />
<br />
If you are using oh-my-zsh, it's a bit easier:<br />
1. Create your directory under ~/.oh-my-zsh/plugins<br />
For example ~/.oh-my-zsh/plugins/myautocomplete<br />
2. Place this file in this directory.<br />
3. Edit ~/.zshrc, find "plugins" and add "myautocomplete" to the list.<br />
4. Start a new shell.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-15133107230204354882017-10-10T01:46:00.001+03:002017-10-10T01:46:18.094+03:00How do I avoid the error "Unable to validate the following destination configurations" when using S3 event notifications in CloudFormation?There is a very <a href="https://aws.amazon.com/premiumsupport/knowledge-center/unable-validate-destination-s3/">important post</a> about avoiding the "Unable to validate the following destination configurations" in AWS.<br />
Too bad it's not mentioned just next to both S3 and SNS/SQS reference documentation.<br />
<br />
BUT! This post is missing an important part: <b>You will get this error even if you didn't specify the TopicPolicy (or QueuePolicy) at all!</b><br />
Furthermore, <b>you will get this error even if you specified the policy, but it's not correct.</b><br />
For example, if your policy is too restrictive and S3 cannot send events to SNS, you will also get this error! Is this clear from the error's description? Not really. Is it clear from the AWS post above? No, not at all.<br />
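For illustration, here is the general shape of a topic policy that the notification validation will typically accept (resource names and the bucket ARN are hypothetical):

```yaml
MyTopicPolicy:
  Type: AWS::SNS::TopicPolicy
  Properties:
    Topics:
      - !Ref MyTopic
    PolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            Service: s3.amazonaws.com
          Action: sns:Publish
          Resource: !Ref MyTopic
          Condition:
            ArnLike:
              aws:SourceArn: arn:aws:s3:::my-bucket
```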
<br />
So just remember: when you see "Unable to validate the following destination configurations", check the policy. It may be missing. It may be too permissive or too restrictive; either way, the problem is with the policy, not with the bucket. Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-5239535491482352731.post-53357736382083478102017-10-04T23:43:00.002+03:002017-10-04T23:44:33.586+03:00CloudFormation TipsSome tips for using CloudFormation:<br />
<br />
1. Don't specify a resource name unless you absolutely must. This way you avoid name clashes, since CloudFormation will automatically assign unique names to your resources.<br />
2. If you need to specify a name, include the stack name in it. This way you reduce the potential for name clashes. You can also include the partition and region for resources whose names are global (e.g. S3 bucket names). Note that this will NOT prevent naming clashes completely, since somebody else can still use the same name.<br />
3. When creating any IAM resources in your stack, make sure to add DependsOn to the resources that use these IAM resources. Apparently CloudFormation is not smart enough to resolve this dependency tree and handle it without additional configuration.<br />
4. Sometimes the names CloudFormation gives your resources are completely unrelated to the stack name. Include the ARNs of such resources in the Outputs, so you can easily find them later when needed.<br />
5. A very common scenario in AWS is an S3 bucket that fires events to SNS or SQS when a file is uploaded. Apparently it's impossible to create this in a single change. See this <a href="https://aws.amazon.com/premiumsupport/knowledge-center/unable-validate-destination-s3/">post</a>.<br />
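For tip 3, a hypothetical sketch of an explicit DependsOn on an IAM policy (all names invented):

```yaml
MyFunction:
  Type: AWS::Lambda::Function
  DependsOn: MyPolicy   # wait for the IAM policy before creating the function
  Properties:
    Role: !GetAtt MyRole.Arn
    Handler: index.handler
    Runtime: python3.12
    Code:
      ZipFile: "def handler(event, context): return"
```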
<br />
<br />
See also:<br />
<br />
<script src="https://gist.github.com/tarlog/5e3c8c8558b082097937307038517e45.js"></script>Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-5239535491482352731.post-73696997818457769652016-04-10T13:34:00.001+03:002016-04-10T13:34:37.453+03:00Print Gradle DependenciesOne way to print the project's Gradle dependencies is 'gradle dependencyReport'.<br />
However, it creates a very large file with many scopes that are sometimes hard to track.<br />
Sometimes it can be useful just to print the list of dependencies of a specific scope.<br />
A very small script can do the job, and here are some examples:<br />
<br />
<script src="https://gist.github.com/tarlog/38b81c810f86504586b2debeb42f1fe1.js"></script>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-29162009890926154892015-11-23T17:53:00.001+02:002015-11-23T17:53:47.050+02:00Dropwizard: Add thread name to logThis should have been trivial, but somehow it isn't.<br />
So I'll put it here.<br />
Adding the thread name to Dropwizard logs:<br />
<br />
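In short, it comes down to setting a logFormat that includes Logback's %thread conversion word; a config sketch (the exact pattern is up to you):

```yaml
logging:
  appenders:
    - type: console
      logFormat: "%-5level [%d{ISO8601}] [%thread] %logger{36}: %message%n"
```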
<script src="https://gist.github.com/tarlog/3815b8691854ad717f17.js"></script>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-42687715366176504382015-11-23T17:02:00.001+02:002015-11-23T17:06:41.897+02:00Unix Shell: Use of functions to create complicated aliasesIn *nix shells it is sometimes useful to create aliases that receive parameters.<br />
This can be done using functions:<br />
<br />
<script src="https://gist.github.com/tarlog/26e0528c871261919994.js"></script><br />
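The gist above follows this pattern; for illustration (the key path and port are invented):

```shell
# kssh: ssh with a specific private key; extra arguments pass through
kssh() {
    ssh -i ~/.ssh/mykey.pem "$@"
}

# pssh: ssh on a non-standard port
pssh() {
    ssh -p 2222 "$@"
}
```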
<br />
Now you can type just something like: <b>kssh 192.168.1.1</b> or <b>pssh 192.168.1.2</b>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-79243563380262431132015-01-18T17:12:00.000+02:002015-01-18T17:15:02.006+02:00Pretty Format of JSON in vimOpen ~/.vimrc<br />
Add the following line:<br />
<div style="width: 550px; background-color: #d7d7d7; color: #000000; font-family: courier; font-size: 12px; text-align: left; border: #668844 1px solid; overflow: auto; padding: 4px;"><pre>command Json :%!python -m json.tool</pre></div><br />
To format Json, type <b>:Json</b><br />
Note: you need Python installed.<br />
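On systems where only Python 3 is installed (json.tool works the same there), the line would be:

```vim
command Json :%!python3 -m json.tool
```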
<br />
Inspired by <a href="Pretty printing JSON in Vim">this post</a>.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-14821844200208156552014-05-14T10:18:00.001+03:002014-05-20T12:53:50.437+03:00Monitor Java on Unix/Linux/SolarisJust a short memo of useful commands that can help to troubleshoot Java on *nix:<br />
(Btw, most of them will also work on Windows, but who runs Java on Windows? Just kidding...)<br />
<br />
<b>jps -m</b> - show running Java processes and their pids.<br />
<b>jstack <pid></b> - print thread dump<br />
<b>jmap -dump:format=b,file=<path to file> <pid></b> - save heap dump to file<br />
<br />
Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-5239535491482352731.post-78501707136700731222013-08-01T15:28:00.000+03:002013-08-01T15:28:43.516+03:00UnresolvedAddressException TipGetting java.nio.channels.UnresolvedAddressException?<br />
Having no idea why this happens?<br />
<br />
Check the code that creates the address. Did you use java.net.InetSocketAddress.createUnresolved(String, int) to create it?<br />
Do NOT! Just use <b>new java.net.InetSocketAddress(String, int)</b> and it should be fixed.<br />
<br />
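To see the difference (using localhost so no real DNS lookup is needed):

```java
import java.net.InetSocketAddress;

public class AddressDemo {
    public static void main(String[] args) {
        // createUnresolved skips name resolution; NIO channels will later
        // throw UnresolvedAddressException when given such an address
        InetSocketAddress bad = InetSocketAddress.createUnresolved("localhost", 80);
        System.out.println(bad.isUnresolved());  // true

        // the plain constructor resolves the host name eagerly
        InetSocketAddress good = new InetSocketAddress("localhost", 80);
        System.out.println(good.isUnresolved()); // false
    }
}
```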
P.S. This is the kind of post I write after spending hours on a stupid bug.<br />
So that people can google it and spend less time on it.Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-5239535491482352731.post-68399331736285103002013-05-21T10:26:00.000+03:002017-10-10T02:01:39.818+03:00DevOps: Making Fast Deployments of Java Servers using Maven and Nexus<i><b>A Warning:</b> this post is theoretical. I have never tried something like this yet. Maybe I will try it in the future. But currently it's just a nice idea.</i><br />
<i>In addition, if you know of somebody who works in a similar way, I would really like to know. So please comment!</i><br />
<br />
If you provide a SaaS service, you probably have multiple Java servers running in some sort of a cluster. If your solution is complicated and multi-tier, you probably have multiple server types. And now comes the question: <b>How to make quick deployments to production?</b><br />
<br />
The common solution suggests that you build a package and release it. It might be a war, a zip, or an rpm if you are running on Linux.<br />
Once released, you upload the package to the server, unzip/copy it to the relevant folder and restart the server.<br />
<br />
The problem with this solution shows when your packages are large. (And if you are using OSGi, your packages are usually very large!) The upload itself takes time. It also consumes bandwidth, which might become expensive if you perform a lot of deployments. And the really funny thing is that most of the upload is redundant: most of the jars in your package are third parties that do not change between deployments at all!<br />
<br />
The common solution suggests pre-uploading the third-party jars to the server and excluding them from the package. I've seen such solutions, and in my opinion they are the exact opposite of a good solution: you split the package, the third parties become managed in two (sometimes more) places, and each deployment involves at least one additional (probably manual!) step of checking whether the third parties have changed and whether an additional deployment of them is required.<br />
<br />
But suppose you use Maven, and you upload your released packages to Nexus (or any other Maven repository). This repository contains all the third parties, all the released packages and, most important: <b>the pom file that was used to build your project!</b><br />
If you download this pom file, you will be able to build the package on the production server! Note that you don't need the full build that includes compilation, testing and so on. You just need your package, so assuming you deploy a war, you only need to run <i>"mvn war:war"</i> (once again: I never tried it myself and the actual execution might be more complex, but I think the idea is clear).<br />
Sometimes, if you are running a Java application with a main class (plain old Java, not some kind of JEE inside an application server or a servlet container), you don't even need a package; you just need a correct classpath, and Maven will be happy to assist you: <i>mvn dependency:build-classpath</i>.<br />
<br />
So I guess that the idea is clear now. Each time Maven will download only the relevant jars and save them to the local repository. The dependencies are managed in the same pom file that is used to release the application, so when making a package, or creating a classpath on a production machine, the exact same dependencies will be used.<br />
And the deployment process will become much faster!<br />
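In pseudocode, the deployment step on a production host might look like this (coordinates, paths and class names are made up):

```
# fetch only the released pom from Nexus
mvn dependency:get -Dartifact=com.example:myserver:1.2.3:pom
# resolve the dependencies listed in that pom and write the classpath to a file
mvn -f myserver-1.2.3.pom dependency:build-classpath -Dmdep.outputFile=cp.txt
# run the server with the resolved classpath
java -cp "$(cat cp.txt)" com.example.server.Main
```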
<br />
I know that this idea is somewhat different from the usual process. Instead of doing something like "build, deploy, run", we do something that might look even more complicated: "build, deploy descriptor only, package, run". But this should be much faster. So I definitely think that this idea is worth trying.<br />
<br />
<br />
P.S. The idea described in this post relates only to the package itself: building, packaging and running. The deployment may contain additional steps, like changing local configuration files and so on. These steps are not covered here, as they are usually not covered by a build process, but are part of the release notes. A possible solution is to deploy the relevant scripts to the Nexus repository and somehow describe them in the pom file. When downloading the pom, the relevant scripts would also be downloaded and executed. <br />
<br />
P.P.S. The idea also doesn't cover the tool that drives the whole process. Although it assumes the tool uses Maven, it says nothing about the actual implementation. It might be a Java process. Or a shell script. Or even Ant.<br />
<br />
P.P.P.S. Notice that downloading files from Nexus using Maven performs important checks for you, for example an integrity check, which is very important when the network between the release Nexus and a production site is bad. <br />
In addition, you can make some optimizations on Nexus. For example, if you have several production sites all over the world, each site may have its own Nexus pointing to the main release repository and caching it. This will make the deployments even faster.<br />
<br />
<hr />Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-5239535491482352731.post-87929959214795116402013-02-13T16:53:00.001+02:002013-02-13T16:53:50.571+02:00Deadlock in Jetty or Be Careful while SynchronizingAbout nine months ago I <a href="https://bugs.eclipse.org/bugs/show_bug.cgi?id=377610">reported a bug</a> to the Jetty community that session timeout doesn't work properly. The bug was fixed quite quickly, but nine months later I have discovered that the fix leads to a <a href="https://bugs.eclipse.org/bugs/show_bug.cgi?id=399799">deadlock</a> in some scenarios. <br />
<br />
The deadlock in Jetty illustrates an interesting coding guideline that you must follow when writing your code.<br />
<br />
So what happened in Jetty?<br />
<br />
Consider a class A that carries state and lives in a multi-threaded environment. Obviously this class must be synchronized.<br />
Consider that you can subscribe to events of class A. So let's say you must implement an interface I that will be notified when something important in class A happens.<br />
Let's assume that the method in which class A invokes instances of I is synchronized (A carries state, remember?)<br />
Let's also assume that your implementation of I also carries state and must be synchronized as well.<br />
<br />
And now let's see what happens:<br />
<br />
Thread-1: Some event on A occurs. A wants to notify I, but first it acquires LOCK_A and then invokes a method of I. The method of I tries to change the state of I, so it tries to acquire LOCK_I, but LOCK_I was already acquired by Thread-2.<br />
<br />
Thread-2: Runs on I. It changes the state of I, so it acquires LOCK_I. During the change it needs some information from A. It tries to get it, but LOCK_A was already acquired by Thread-1.<br />
<br />
And here we have a deadlock.<br />
<br />
So what is wrong here?<br />
The most wrong part is in class A: it invokes a method of some other class while it is locked. BAD! Release the lock before calling someone else! And when I say "someone else", I include the other methods of the same class! (What really happened in Jetty is that in class A, method f1() was synchronized. Method f1() called f2(), which called f3(), which called f4(), which called I. It was clear in f4() that no synchronization was needed. But the mistake is actually in f1()!)<br />
So you have some members to change? Acquire the lock, change them and release the lock.<br />
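A sketch of the safe pattern (hypothetical class, mirroring A above): take a snapshot under the lock, then notify outside it.

```java
import java.util.ArrayList;
import java.util.List;

class Notifier {
    private final Object lock = new Object();
    private final List<Runnable> listeners = new ArrayList<>();
    private int state;

    void addListener(Runnable l) {
        synchronized (lock) {
            listeners.add(l);
        }
    }

    void update(int newState) {
        List<Runnable> snapshot;
        synchronized (lock) {           // critical section: touch members only
            state = newState;
            snapshot = new ArrayList<>(listeners);
        }
        for (Runnable l : snapshot) {   // callbacks run without holding the lock
            l.run();
        }
    }

    public static void main(String[] args) {
        Notifier n = new Notifier();
        n.addListener(() -> System.out.println("notified outside the lock"));
        n.update(1);
    }
}
```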
<br />
In addition, the situation could be improved a little if a read-write lock was used instead of synchronized: most accesses to the classes only read data. Maybe if LOCK_A was split into READ_LOCK_A and WRITE_LOCK_A, and LOCK_I into READ_LOCK_I and WRITE_LOCK_I, the deadlock would not have occurred. But this is about reducing the likelihood of the situation, not preventing it.<br />
<br />
<h2>Summary</h2>The main point of my post is that when synchronizing, find the critical section and synchronize it only! Do not call other methods (even if they are of the same class) from the critical section: gather all information before and notify everyone else after.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-9654495921141837792013-01-14T16:53:00.001+02:002013-01-14T16:55:31.731+02:00Initiate Infinispan with a Custom JChannelIn all Infinispan samples the JGroups part is almost omitted. You are told that you should provide a configuration xml, and infinispan-core.jar contains three samples: udp, tcp and ec2.<br />
<br />
But what happens if you don't want to use samples? What happens if you want to have a dynamic configuration that changes based on the environment/database/external configuration tool? What happens if you want to configure it from code?<br />
<br />
It appears that this is not an easy task, but not a very complicated one either. A special interface org.infinispan.remoting.transport.jgroups.JGroupsChannelLookup exists that allows an injection of org.jgroups.Channel. <br />
<br />
So you can write something like: <br />
<div style="width: 550px; background-color: #d7d7d7; color: #000000; font-family: courier; font-size: 12px; text-align: left; border: #668844 1px solid; overflow: auto; padding: 4px;"><pre>public class JGroupsChannelLookupImpl implements JGroupsChannelLookup {
    public Channel getJGroupsChannel(Properties p) {
        return new JChannel(); // of course the code here might be quite complex, initializing the channel with a custom configuration
    }
    public boolean shouldStartAndConnect() {
        return true; // change to false if you start and connect the channel yourself
    }
    public boolean shouldStopAndDisconnect() {
        return true; // change to false if you stop and disconnect the channel yourself
    }
}</pre></div><br />
Make sure to have a default public no-args constructor or a static getInstance() method that returns an instance of JGroupsChannelLookup.<br />
(Personally I prefer the static method, so I have control over how the instance is created and managed.)<br />
<br />
Now you need to tell Infinispan to use your JGroupsChannelLookup:<br />
<div style="width: 550px; background-color: #d7d7d7; color: #000000; font-family: courier; font-size: 12px; text-align: left; border: #668844 1px solid; overflow: auto; padding: 4px;"><pre>Configuration conf = new ConfigurationBuilder().clustering().cacheMode(CacheMode.REPL_SYNC).sync().build();
GlobalConfiguration globalConf = new GlobalConfigurationBuilder().clusteredDefault().transport()
        .addProperty(JGroupsTransport.CHANNEL_LOOKUP, JGroupsChannelLookupImpl.class.getName()).asyncTransportExecutor()
        .build();
DefaultCacheManager manager = new DefaultCacheManager(globalConf, conf);</pre></div>Unknownnoreply@blogger.com1tag:blogger.com,1999:blog-5239535491482352731.post-81954782183393130562012-10-02T14:29:00.001+02:002012-10-02T14:37:09.831+02:00Send Mail Via GmailI already spent some time a few years ago figuring that out, and now I have spent it again.<br />
So here's a short code snippet with an example of how to send mail using JavaMail and Gmail.<br />
<br />
First, pom.xml. Include the following dependencies:<br />
<br />
<div style="width: 550px; background-color: #d7d7d7; color: #000000; font-family: courier; font-size: 12px; text-align: left; border: #668844 1px solid; overflow: auto; padding: 4px;"><pre><dependency>
    <groupId>javax.mail</groupId>
    <artifactId>javax.mail-api</artifactId>
    <version>1.4.5</version>
</dependency>
<dependency>
    <groupId>javax.mail</groupId>
    <artifactId>mail</artifactId>
    <version>1.4.5</version>
</dependency></pre></div><br />
And now the code:<br />
<br />
<div style="width: 550px; background-color: #d7d7d7; color: #000000; font-family: courier; font-size: 12px; text-align: left; border: #668844 1px solid; overflow: auto; padding: 4px;"><pre>import java.util.Date;
import java.util.Properties;

import javax.mail.Message;
import javax.mail.Session;
import javax.mail.internet.InternetAddress;
import javax.mail.internet.MimeMessage;

import com.sun.mail.smtp.SMTPTransport;

public class SendMail {

    public static void main(String[] args) throws Exception {
        Properties props = System.getProperties();
        props.put("mail.smtps.host", "smtp.gmail.com");
        props.put("mail.smtps.auth", "true");
        Session session = Session.getInstance(props, null);
        Message msg = new MimeMessage(session);
        msg.setFrom(new InternetAddress("somebody@gmail.com"));
        msg.setRecipients(Message.RecipientType.TO,
                InternetAddress.parse("somebody_else@gmail.com", false));
        msg.setSubject("I'm the subject!");
        msg.setText("And I'm text of your message");
        msg.setHeader("X-Mailer", "Java Program");
        msg.setSentDate(new Date());
        SMTPTransport t = (SMTPTransport) session.getTransport("smtps");
        t.connect("smtp.gmail.com", "somebody@gmail.com",
                "here comes the password");
        t.sendMessage(msg, msg.getAllRecipients());
        System.out.println("Response: " + t.getLastServerResponse());
        t.close();
    }
}</pre></div><br />
And the thanks go <a href="http://stackoverflow.com/a/73649/547779">here</a>.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-44461095733105801622012-07-01T14:47:00.001+03:002012-07-01T14:51:06.121+03:00Apache HTTP Client - Ignore SSL ProblemsSometimes, especially when testing, it can be useful to make Apache HTTP Client ignore SSL problems.<br />
SSL problems include certificate trust (issuer) checks and host name verification. The following snippet creates an Apache HttpClient with a SingleClientConnManager that will ignore these SSL problems:<br />
<div style="width: 550px; background-color: #d7d7d7; color: #000000; font-family: courier; font-size: 12px; text-align: left; border: #668844 1px solid; overflow: auto; padding: 4px;"><pre>import java.security.GeneralSecurityException;

import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

import org.apache.http.client.HttpClient;
import org.apache.http.conn.scheme.PlainSocketFactory;
import org.apache.http.conn.scheme.Scheme;
import org.apache.http.conn.scheme.SchemeRegistry;
import org.apache.http.conn.ssl.SSLSocketFactory;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.SingleClientConnManager;

public HttpClient createHttpClient() throws GeneralSecurityException {
    // trust manager that accepts any certificate chain
    TrustManager[] trustAllCerts = new TrustManager[] { new X509TrustManager() {
        @Override
        public java.security.cert.X509Certificate[] getAcceptedIssuers() {
            return null;
        }

        @Override
        public void checkClientTrusted(java.security.cert.X509Certificate[] certs, String authType) {
        }

        @Override
        public void checkServerTrusted(java.security.cert.X509Certificate[] certs, String authType) {
        }
    } };
    SSLContext context = SSLContext.getInstance("TLS");
    context.init(null, trustAllCerts, null);
    // hostname verification is disabled as well
    SSLSocketFactory sf = new SSLSocketFactory(context, SSLSocketFactory.ALLOW_ALL_HOSTNAME_VERIFIER);
    SchemeRegistry schemeRegistry = new SchemeRegistry();
    schemeRegistry.register(new Scheme("http", 80, PlainSocketFactory.getSocketFactory()));
    schemeRegistry.register(new Scheme("https", 443, sf));
    SingleClientConnManager cm = new SingleClientConnManager(schemeRegistry);
    return new DefaultHttpClient(cm);
}
</pre></div><br />
<b>Pay Attention! In production you must use a valid SSL configuration! Use this code for testing purposes only!</b>Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-26869154645134276922012-06-18T18:01:00.002+03:002012-06-18T18:01:38.731+03:00Jetty - Basic HardeningJetty by default is shipped with two annoying features that should be turned off in production.<br />
<br />
The first one is contexts listing. If you access the root folder, and there is no context configured as the root context, Jetty will display a list of all installed contexts. While it may be nice to see during development, it's unnecessary information in production.<br />
<br />
The class responsible for displaying this list is <b>org.eclipse.jetty.server.handler.DefaultHandler</b> and it's configured in jetty.xml. Set <b>showContexts</b> to <b>false</b> to turn off the contexts listing.<br />
<br />
Actually, you may consider providing your own Handler class instead of the DefaultHandler. The <b>org.eclipse.jetty.server.handler.DefaultHandler</b> is also responsible for displaying Jetty's default favicon. It may be configured not to serve the favicon at all with setServeIcon(false), but it does not allow customizing the favicon. So if you want to do that, you'll need a custom class.<br />
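For illustration, the setting in jetty.xml might look roughly like this (the surrounding structure depends on your jetty.xml; the Jetty XML <Set> element maps to the setter):

```xml
<New class="org.eclipse.jetty.server.handler.DefaultHandler">
  <Set name="showContexts">false</Set>
</New>
```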
<br />
<br />
The second annoying feature is directory content listing - when you access a directory, Jetty will generate a page listing the files inside. This can be turned off per context: under <Configure class="org.eclipse.jetty.webapp.WebAppContext"> put the following init parameter:<br />
<br />
<pre><Call name="setInitParameter">
<Arg>org.eclipse.jetty.servlet.Default.dirAllowed</Arg>
<Arg>false</Arg>
</Call></pre>Unknownnoreply@blogger.com7tag:blogger.com,1999:blog-5239535491482352731.post-34411984783803168882012-05-07T15:10:00.000+03:002012-05-07T15:10:11.803+03:00REST Best Practices: Use HTTP Status CodesWhen implementing a RESTful service, keep in mind that HTTP already provides the ability to send a status code as part of the protocol. Do NOT put the error code inside the message itself!<br />
<br />
HTTP defines five types of status codes:<br />
* <b>1xx - Informational</b><br />
* <b>2xx - Successful</b><br />
* <b>3xx - Redirection</b><br />
* <b>4xx - Client Errors</b><br />
* <b>5xx - Server Errors</b><br />
<br />
<i>See <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html">Full reference of HTTP/1.1 Status Code Definitions</a></i><br />
<br />
I'm not going to describe all HTTP Status codes here, but to give some basic tips.<br />
<br />
1. In case of a successful request, always return a status code from the <b>2xx</b> group. It's highly recommended to use not only the basic 200 code, but also additional codes. For example, 201 means a new resource was created and the Location header contains its path. 204 means that the response contains no content, so the client can optimize its code and not even try to read the content of the response (saves some redundant object creation).<br />
<br />
2. In case of an error request, clearly distinguish between the client and server errors. Client errors mean that the client has sent a bad request: it may be incorrectly formatted, unauthorized, method not allowed (e.g. Server accepts GET requests, while POST has been sent), and more.<br />
In case of a client error, it is best to detect it as early as possible in the server's code, to reduce the amount of logic that runs before the request is rejected. For example, it's redundant to parse the request body if it's missing the security header.<br />
2.1 Send a clear response to the client. It can be good practice to supply text or html content inside the message telling what was wrong with the request.<br />
2.2 Don't log client errors above the info level. After all, it's the client's error, not the server's. The only reason to log client errors at all is to help clients by taking a look at the server logs. Make sure not to create the misleading picture of a log full of exceptions while there is no problem on the server side at all.<br />
2.3 By receiving the 4xx code, the client must understand that it did something wrong and should correct its request.<br />
<br />
3. In case of a server error, return a <b>5xx</b> error code. Usually the best choice will be to simply return <b>500</b>. Do NOT put the full stacktrace into the response body. Actually, put nothing in the response body. Why should a client care about the reason for the server's failure? Is the database down? Is it a code problem? Whatever it is, it's not the client's business.<br />
3.1 Sometimes it can be nice to return the <b>501 Not Implemented</b> code. Usually it will happen if you agreed on some functionality, created a prototype for it, but haven't implemented it yet.<br />
3.2 By receiving a 5xx code, the client must understand that something went wrong and that it should retry the same request. Whether it makes sense to retry immediately, in 5 minutes or in 5 days depends on the client. The server can return the <b>Retry-After</b> header, and the client should respect it.<br />
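The client side of 3.2 can be sketched like this. Only the delta-seconds form of the Retry-After header is handled here; the HTTP-date form, the class name and the fallback behavior are simplifications of my own:<br />

```java
// Sketch of 3.2 from the client's side: respect Retry-After when it is present.
// Only the delta-seconds form of the header is parsed; the HTTP-date form and
// the method/class names are illustrative assumptions.
class RetryAfter {

    static long delayMillis(String retryAfterHeader, long defaultMillis) {
        if (retryAfterHeader == null) {
            return defaultMillis;           // no hint from the server
        }
        try {
            return Long.parseLong(retryAfterHeader.trim()) * 1000L;
        } catch (NumberFormatException e) {
            return defaultMillis;           // HTTP-date form or garbage: fall back
        }
    }
}
```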
<br />
There are more status codes that can and should be used with RESTful APIs. For example, <b>304 Not Modified</b> allows you to save traffic and skip the response if the resource was not modified. Many status codes are implemented by the frameworks and intermediaries; for example, JAX-RS frameworks will automatically return some client errors, like 405 and 415.<br />
<br />
<h2>Summary</h2>1. In case of a successful request, always return a status code from the <b>2xx</b> group.<br />
2. In case of an unsuccessful request, never return 2xx with an embedded status code in the message. Return an appropriate 4xx or 5xx status code.<br />
3. For client errors, return 4xx. Don't log these requests above the info level. These are the client's problems, not the server's.<br />
4. For server errors, return 5xx. Log the error, but don't send it to the client.<br />Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-3809904866388805602012-05-03T15:37:00.004+03:002012-05-03T15:46:41.297+03:00Promoting tarlog-pluginsIn the four years since I first published the <a href="http://code.google.com/p/tarlog-plugins">tarlog-plugins</a> Eclipse plugin, it has had 4600+ downloads.<br />
I know that many readers of this blog are actually using it.<br />
I won't ask why none of you has ever clicked the Donate button. After all, I have never done it myself...<br />
But why won't you share it?<br />
You can <b>star</b> and/or <b>+1</b> it on the <a href="http://code.google.com/p/tarlog-plugins">project's page</a>. <br />
You can <b>favorite</b> it on the <A href="http://marketplace.eclipse.org/content/tarlog-plugins">marketplace</a>.<br />
You can... well, the previous two are enough. But if you really want to Donate, there is a Donate button on the project's page.<br /><br />
P.S. And there is also an <a href="http://code.google.com/p/tarlog-encoder-tool/">Encoder Tool</a>. It didn't become as popular as <a href="http://code.google.com/p/tarlog-plugins">tarlog-plugins</a>, but it's still nice for small encoding/decoding tasks.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-25313572668058092352012-04-16T10:01:00.001+03:002012-04-16T10:02:49.856+03:00REST Best Practices: Using HTTP VerbsA common misconception about RESTful APIs is that they MUST be <a href="http://en.wikipedia.org/wiki/Create,_read,_update_and_delete">CRUD</a>-like: Create maps to POST. Read maps to GET. Update maps to PUT. Delete maps to DELETE.<br />
<br />
<b>In fact this is incorrect</b>. They MAY be CRUD-like or MAY be something else. The rule is: each resource MUST have <i>a set of predefined operations</i>, but not necessarily CRUD. Not even GET-POST-PUT-DELETE necessarily has to be used. In fact, did you know that HTTP/1.1 defines 8 HTTP verbs: HEAD, GET, POST, PUT, DELETE, OPTIONS, TRACE and CONNECT? And that in addition you may define your own HTTP verbs that behave as you wish? For example, <a href="http://en.wikipedia.org/wiki/WebDAV">WebDAV</a> defines the following verbs in addition to the standard ones: PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, UNLOCK.<br />
<br />
So now that you know you don't have to stick to the standard HTTP verbs, I still advise you to do so. Why? It's relatively clear what action these methods will perform on a resource. It may be much less clear what action your own verb will perform.<br />
<br />
So let's assume that 90% of RESTful APIs will use only the standard HTTP verbs. What's important to pay attention to? The most important rule is <b>to use the HTTP verbs correctly</b>. The most common example of an incorrect use of the GET verb comes from HTML forms: in HTML both GET and POST can be used to submit data to a server from a form. But GET is actually <b>a safe method</b>, which means that it SHOULD NOT have the significance of taking an action other than retrieval. <br />
<br />
So remember, Rule 1: HEAD and GET are safe methods.<br />
<br />
Now about the idempotent methods. Methods are idempotent if the side-effects of N > 0 identical requests are the same as for a single request. The idempotent methods are GET, HEAD, PUT and DELETE.<br />
Some examples of what this means:<br />
1. PUT can be used to create a resource. Yes, I know that best practice says it is POST that creates resources, and in a second I'll explain why. So let's say PUT creates a resource: if the request is duplicated for some reason, the second request must not create a new resource and must return exactly the same response as the first one. It's important to understand that various HTTP intermediaries know that PUT is idempotent and may, for example, apply some caching. So POST is actually much safer, especially for creating resources. But pay attention: POST may safely be used for updates and even for fetching a resource! Nobody said it is only for creation. <br />
2. When DELETE-ing a resource, pay attention that a second DELETE request should return the same response. So if the first response returns "200 OK", the second (third, fourth, etc.) DELETE request to the same resource should also return "200 OK". Note that once the resource has been deleted, a GET request to it should return "404 NOT FOUND", but DELETE should continue returning "200 OK".<br />
<br />
So Rule 2 will be: use the idempotent methods correctly. If you cannot ensure that you use them correctly, use other methods; POST can be a good choice.<br />
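The DELETE semantics from example 2 can be sketched with an in-memory store. The Map-backed store and the plain int status codes are simplifications of my own, just to show the idempotent behavior:<br />

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of example 2: DELETE stays idempotent, returning "200 OK" no matter
// how many times it is repeated, while GET returns 404 once the resource is gone.
// The in-memory store and the int status codes are illustrative assumptions.
class BookStore {

    private final Map<String, String> books = new ConcurrentHashMap<>();

    void create(String id, String title) {
        books.put(id, title);
    }

    int get(String id) {
        return books.containsKey(id) ? 200 : 404;
    }

    int delete(String id) {
        books.remove(id);  // removing an absent entry is a no-op
        return 200;        // the second, third... DELETE returns the same response
    }
}
```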
<br />
Now a little about the methods that you are unlikely to implement yourself. The behavior of HEAD, TRACE, CONNECT and OPTIONS is well defined, and they are usually implemented by the infrastructure (CONNECT and TRACE by the web servers, HEAD and OPTIONS by the frameworks, like Servlets or JAX-RS). If, for some reason, you decide to implement one of these methods, make sure you do it correctly. <br />
<br />
<h2>Summary</h2>1. Don't try to map CRUD to HTTP verbs. There is no need to do it.<br />
2. Create new HTTP verbs, if needed.<br />
3. Use existing HTTP verbs correctly, pay special attention to the safe methods (HEAD and GET) and to the idempotent methods (HEAD, GET, PUT and DELETE).<br />
4. If you are not sure which HTTP verb to choose, use POST.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-60820098442771465892012-04-11T14:35:00.000+03:002012-04-11T14:35:24.459+03:00REST Best Practices: Use JSONSomething that RESTful design does not care about is the actual representation. The concept says that a resource can have one representation or more; it does not affect the resource itself. JAX-RS solved this very elegantly with a clear separation between the Resources and the Providers.<br />
<br />
And then real life happens, and you need to implement a resource and represent it somehow. The natural choice of programmers who come to the RESTful world from SOAP is XML.<br />
There are several reasons for it, the main one being: they are used to XML, so why change something that works? Or at least worked?<br />
Wrong! You don't need to stick to something you are used to. You can use something better. And JSON is better. Why? The main reason: it's <a href="http://tarlogonjava.blogspot.com/2012/04/why-rest-or-simple-vs-easy.html">simpler</a> - it has no namespaces and no attributes. There's less to write and fewer places to make a mistake.<br />
Compatibility issues are solved better with JSON, since you don't have XSDs and you cannot fail validation. New fields will be silently ignored by the old version, and that's all.<br />
You can always create XML out of JSON in case your client prefers to get XML. But it's not so easy to create JSON out of XML (yes, I've heard of Badgerfish, but if you start your design from JSON and then add XML, you don't need Badgerfish).<br />
JSON is also less verbose, which saves you traffic, but really that's not the main reason to use it. The main reason is SIMPLICITY!<br />
<br />
Now a little implementation note for those of you who use Java: create the classes that you want to be sent/received by the resources as POJOs. Keep them simple and prefer String as the major data type. If you need something complex, like Date, do the formatting yourself and describe it in your documentation (SimpleDateFormat will help you). Once you have a set of simple POJOs, use <a href="http://jackson.codehaus.org/">Jackson</a> to create/parse JSON and JAXB (integrated in Java 6+) to create/parse XML. Jackson comes with JAX-RS Providers; JAXB Providers are already included by the JAX-RS implementations. Thus in one shot you'll get both JSON and XML representations for your resources.<br />
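A sketch of such a POJO (the field names are made up, and the hand-rolled toJson() is there only to show the shape; in a real project Jackson would do the serialization):<br />

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Sketch of the POJO style described above: simple fields, String as the major
// data type, the date formatted up front with SimpleDateFormat. The toJson()
// method is hand-rolled only to show the shape Jackson would produce.
class Book {

    private final String title;
    private final String published;  // already a formatted String, not a Date

    Book(String title, Date publishedDate) {
        this.title = title;
        this.published = new SimpleDateFormat("yyyy-MM-dd").format(publishedDate);
    }

    String toJson() {
        return "{\"title\":\"" + title + "\",\"published\":\"" + published + "\"}";
    }
}
```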
<br />
Back to the representations. And what if you cannot use JSON? Of course such a thing can happen. For example, you may want to upload an image, and there is no need to try to fit it into JSON. That's fine. Use whatever representation you want, but AVOID CUSTOM REPRESENTATIONS! If you feel that you must invent some new serialization mechanism that was not invented before, think twice. Are you really so unique? Really? Remember, you may have different clients. Each of them will have to know your serialization mechanism; will it be simple for them? Or will you need to create a custom library? Remember: there are a lot of different languages, technologies and platforms, and you may need to support all of them.<br />
<br />
<h2>Summary</h2>1. Design your APIs to send/receive JSON.<br />
2. If you need XML, it can easily be added without writing a single line of code (at least with JAX-RS).<br />
3. Need to send/receive something binary? Fine. But do your best to use standard formats and don't reinvent the wheel (unless you feel that it's a must).<br />
4. Never use the Java standard serialization mechanism. If you need to serialize a Java class, read #1 again and use JSON ;)Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-63123005155020466582012-04-11T10:56:00.001+03:002012-04-11T14:35:54.581+03:00REST Best Practices: Create a good URLREST is not a standard, therefore you are free to choose how to use it. It's very hard to say if you are doing something "100% right" or "100% wrong". And still there are good RESTful APIs and bad RESTful APIs. Probably the most important part of your RESTful API is the URLs. They identify your resources. Well-designed URLs make your API look good.<br />
<br />
And here are some practices that I personally believe are best (or at least good):<br />
<br />
1. <b>URL must uniquely identify the resource</b> - it should be impossible for the same URL to access two different resources based on something else (e.g. a header).<br />
Example: suppose we design a RESTful API for a library. Let's say the URL <b>/books/ABC</b> returns <a href="http://www.amazon.com/Winnie-Pooh-A-A-Milne/dp/0525477683/ref=sr_1_7?ie=UTF8&tag=tarlog-20&qid=1334127714&sr=8-7">Winnie the Pooh</a> for a registered user. <b>A very BAD practice would be</b> if an unregistered user got a <a href="http://www.amazon.com/The-Modern-Kama-Sutra-Ultimate/dp/1569243093/ref=sr_1_1?s=books&ie=UTF8@tag=tarlog-20&qid=1334127894&sr=1-1">different book</a> for the same URL. It should not matter who the user is; the same URL should lead to the same book.<br />
Now, of course, our application may have security implications, so that, for example, a registered user can see the book and an unregistered one cannot. That's fine: return 404 NOT FOUND for the unregistered user. Or eliminate the book from the book search. But NEVER return a different resource.<br />
<br />
2. <b>URL should be designed for further API changes</b> - this one is kind of tricky. Suppose a registered user can add some books to a favorites list. What URL should be used?<br />
Option 1: "/user/{userid}/favorites/" will return the favorites list of a user. Sounds reasonable. But what happens if we decide to extend this feature and allow a user to have multiple lists? What will this URL return then?<br />
Option 2: "/user/{userid}/favorites/" will return a list of the favorites lists and "/user/{userid}/favorites/{id}" will return a specific list. Sounds reasonable. But what happens if we add a feature that allows users to share lists? Actually, here we get to the point I'll discuss in the next bullet, but it's already quite clear that "/user/{userid}" should not be part of the URL, right?<br />
Option 3: "/favorites/" will return a list of the favorites lists and "/favorites/{id}" will return a specific list. Both the "/favorites/" and "/favorites/{id}" resources will return the lists based on the user's privileges: the system administrator will see all lists, while a user will see his own lists and the ones that were shared with him.<br />
<br />
3. <b>Security is not part of the resource identification and therefore should not be part of the URL</b> - in the previous bullet I have already described why adding a userid to the URL is problematic. Let's talk about it a little bit more. First, you cannot rely on the "userid" present in the URL to identify the user. You need a different form of authentication (unless you use username+password authentication and put the password in the URL as well; it's possible, but it's really a VERY BAD practice). Furthermore, a good API should not contain the definition of an authentication method within it at all. It should be possible to change the authentication method without changing the API (for example, it has been quite common in the last few years to move from username+password to OAuth authentication). Basically, the API should always expect to receive the userid, <b>but not as an integral part of the API!</b> With HTTP it's quite easy to put the security-related stuff in the headers and keep the URL clean.<br />
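To make this bullet concrete, here is a minimal sketch of taking the caller's identity from a header instead of the URL. The "Authorization: Bearer" scheme, the method names and the token-to-user lookup are all illustrative assumptions:<br />

```java
import java.util.Map;

// Sketch of bullet 3: the resource URL stays clean, and the caller's identity
// travels in a header. How the token maps to a user id below is an
// illustrative assumption (in practice the auth layer resolves it).
class CurrentUser {

    static String userIdFrom(Map<String, String> headers, Map<String, String> tokenToUser) {
        String auth = headers.get("Authorization");
        if (auth == null || !auth.startsWith("Bearer ")) {
            return null;  // unauthenticated request
        }
        String token = auth.substring("Bearer ".length());
        return tokenToUser.get(token);  // resolved by the auth layer, not by the URL
    }
}
```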
<br />
4. <b>If you must put additional metadata in the URL, put it as a query parameter</b> - In general, all the metadata you have (like security, content-type, accepted content-type) should be in headers. However, sometimes for technical reasons it becomes impossible to put it in a header, so you must use the URL. That's OK, but put it after the question mark, in the query parameters part. Thus it becomes an "optional" part and can easily be changed or removed later if not needed.Unknownnoreply@blogger.com3tag:blogger.com,1999:blog-5239535491482352731.post-49002460296537526902012-04-10T17:28:00.002+03:002012-04-11T14:36:08.590+03:00Why REST? Or "Simple vs. Easy"This is a post that I have wanted to write for years without finding the correct words. I'm fully aware that hundreds of blog posts have already fully covered this subject, and still I want to write my opinion in my blog. It is my blog after all, right?<br />
<br />
So "Why REST?"<br />
But first, I'd like to say two words. One is "simple" and the other is "easy". Although these words sound like synonyms, they actually are not. Some simple things are not easy. And some easy things are not simple.<br />
For example: breathing is easy for most people; you don't need to think about it. But is it simple? Can you make someone breathe if he doesn't?<br />
On the contrary, cleaning an apartment is simple. But is it easy? Not for me, at least...<br />
<br />
The key point here is that simple things just work. They may not be easy to achieve, but once you achieve them, they work. Once you clean an apartment, you have a clean apartment. For a while, at least.<br />
But when something is not simple, it's quite problematic even to understand how it should work. Sometimes it's easy to start (breathing, for example), but what happens if something goes wrong (it stops, for example)?<br />
<br />
And now back to REST. <br />
<b>REST is simple.</b> Not easy, but simple. Correct RESTful APIs are very clear and very simple to understand. They may not be easy to implement, but once implemented they work. It's quite easy to troubleshoot RESTful APIs; you can do it with a simple HTTP proxy (e.g. Fiddler). And REST ensures quite a decoupling between server and client.<br />
Yes, it takes time to develop a RESTful API. Yes, there is not much automation around, and suddenly a developer needs to write a layer that used to be generated for him with SOAP.<br />
<br />
And now to SOAP (and other APIs that are generated from WSDLs/custom XMLs/whatever): this one is easy. Usually I can get some working API very fast. There are a lot of tools that help you. BUT it is not simple: the tools generate a lot of code that you don't know and that is sometimes not very readable. The integrations become very complex (did you ever install an ESB?). The versioning becomes a nightmare. And the standards don't work very well (did you ever try to integrate Java with .NET using some complex types?)<br />
<br />
So the bottom line is: <b>REST is simple, but not easy. SOAP is easy at the beginning, but not simple.</b> <br />
Simple is good, especially if you continue to keep it simple.<br />
And forget about easy. Nobody said that software development should be easy.Unknownnoreply@blogger.com8tag:blogger.com,1999:blog-5239535491482352731.post-39913101828459359062012-04-04T17:49:00.000+03:002012-04-04T17:49:12.486+03:00Changing Putty Default SettingsIt's trivial, and I should have figured it out myself, but I suffered for many years until I finally googled it, and voila: <b>To save a default setting in Putty, open Putty, change the setting, choose "Default Settings" in "Saved Sessions" and click "Save"</b>.Unknownnoreply@blogger.com0tag:blogger.com,1999:blog-5239535491482352731.post-3053531933699260052011-12-08T17:26:00.002+02:002011-12-13T16:11:03.185+02:00Integration Testing of RESTful ApplicationIn the previous posts I have described <a href="http://tarlogonjava.blogspot.com/2011/12/running-integration-test-using-with.html">how to start Jetty from code in the beginning of the unit tests</a> and <a href="http://tarlogonjava.blogspot.com/2011/12/running-integration-tests-using-with.html">how to initialize the in-memory HSQL database once Jetty is started</a>.<br />
<br />
If you have completed the steps from these posts, you should have a running application at the beginning of your tests. And now you are ready to start the actual testing of the application.<br />
<br />
I believe that the best way to test a RESTful API is to issue actual HTTP requests, and since we have a Jetty server running, this becomes possible.<br />
<br />
There are a lot of HTTP Clients available in Java. The examples below use Apache HTTP Client.<br />
But as always first let's add a maven dependency:<br />
<br />
<div style="width: 550px; background-color: #d7d7d7; color: #000000; font-family: courier; font-size: 12px; text-align: left; border: #668844 1px solid; overflow: auto; padding: 4px;"><pre><dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <scope>test</scope>
</dependency>
</pre></div><br />
Now some convenient static methods that can be used:<br />
<br />
<div style="width: 550px; background-color: #d7d7d7; color: #000000; font-family: courier; font-size: 12px; text-align: left; border: #668844 1px solid; overflow: auto; padding: 4px;"><pre>import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpDelete;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.methods.HttpPut;
import org.apache.http.entity.ByteArrayEntity;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.AbstractHttpMessage;
import org.apache.http.message.BasicHeader;
...
static HttpResponse executeGet(String url) throws IOException {
    HttpClient httpclient = new DefaultHttpClient();
    HttpGet get = new HttpGet(url);
    return httpclient.execute(get);
}

static HttpResponse executeDelete(String url) throws IOException {
    HttpClient httpclient = new DefaultHttpClient();
    HttpDelete delete = new HttpDelete(url);
    return httpclient.execute(delete);
}

static HttpResponse executePost(String url, byte[] body) throws IOException {
    HttpClient httpclient = new DefaultHttpClient();
    HttpPost post = new HttpPost(url);
    post.setEntity(new ByteArrayEntity(body));
    return httpclient.execute(post);
}

static HttpResponse executePut(String url, byte[] body) throws IOException {
    HttpClient httpclient = new DefaultHttpClient();
    HttpPut put = new HttpPut(url);
    put.setEntity(new ByteArrayEntity(body));
    return httpclient.execute(put);
}
</pre></div><br />
So your test can be something like:<br />
<br />
<div style="width: 550px; background-color: #d7d7d7; color: #000000; font-family: courier; font-size: 12px; text-align: left; border: #668844 1px solid; overflow: auto; padding: 4px;"><pre>@Test
public void testGet() throws Exception {
    HttpResponse response = executeGet(url);  // url points at a resource served by the embedded Jetty
    assertEquals(response.getStatusLine().getStatusCode(), 200);
}
</pre></div><br />
<hr /><h3>Recommended Reading</h3>1. <a href="http://www.amazon.com/Next-Generation-Java-Testing-Advanced/dp/0321503104/?tag=tarlog-20">Next Generation Java Testing: TestNG and Advanced Concepts</a><br />
2. <a href="http://www.amazon.com/Apache-Maven-3-Cookbook-Srirangan/dp/1849512442/?tag=tarlog-20">Apache Maven 3 Cookbook</a><br />
3. <a href="http://www.amazon.com/Spring-Recipes-Problem-Solution-Gary-Mak/dp/1430224991/?tag=tarlog-20">Spring Recipes: A Problem-Solution Approach</a>Unknownnoreply@blogger.com0