Companies can spend thousands of pounds sending their technicians on training courses. Most of it is money well spent. But many people like to discover new techniques for themselves and prefer home study to a formal office or classroom setting. Freelance developers, too, cannot always afford the four-figure fees that software houses charge for their courses.
To fill this gap, there is a burgeoning industry of self-help manuals that introduce you to the software of your choice. Following my recent review of Talend's Open Studio, the open source data integration tool, I thought it might be useful to follow up with a review of a book that can help you get started with it.
The book is called "Getting Started with Talend Open Studio for Data Integration" and is written by Jonathan Bowen. It is available as an e-book ($22.94) and in print with a free e-book ($44.99). There is also a Kindle version at $19.47. The book can be purchased from amazon.com, amazon.co.uk, Barnes and Noble and Safari Books Online.
After a brief but important introduction, the book shows you how to download Open Studio from Talend's website and guides you through installing it on your PC. The Talend software comes with example data and jobs, and an appendix in the book shows you how to install the sample data. The book uses this sample data effectively to walk you through the basics of file transformation, then moves swiftly on to working with databases.
For the database examples, you will need to download and install MySQL, an open source database that runs on Windows, Mac OS X and Linux. You will also need the tools to administer it. The book gives you trusted links for downloading MySQL, but you will need to refer to the MySQL documentation to install it and get it running. This is well worth doing: MySQL is free and very easy to use.
Once you have MySQL running, the rest of the book flies by. You learn the techniques that matter most for ETL (extract, transform, load): connecting to databases, creating and amending tables, and filtering, sorting, enriching, normalising and aggregating data. Once you are proficient, the book turns to automation, orchestration, file transfer and the generation of variables. You can then join individual jobs together into process flows that make decisions and take different actions depending on the outcome of each step.
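For readers new to ETL, the operations listed above can be pictured in a few lines of plain Python. This is only a rough sketch and not how Talend or the book does it: in Open Studio you assemble these steps graphically. The order records, field names and the `region_lookup` table are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical input rows, standing in for a file or database table.
orders = [
    {"order_id": 1, "customer": "acme", "amount": 120.0},
    {"order_id": 2, "customer": "bolt", "amount": 15.5},
    {"order_id": 3, "customer": "acme", "amount": 80.0},
    {"order_id": 4, "customer": "cedar", "amount": 42.0},
]

# Hypothetical lookup table used to enrich each row with a region.
region_lookup = {"acme": "North", "bolt": "South", "cedar": "South"}


def run_pipeline(rows, lookup, min_amount):
    """Filter, sort, enrich and aggregate a list of order rows."""
    # Filter: keep only orders at or above the threshold.
    kept = [r for r in rows if r["amount"] >= min_amount]
    # Sort: largest orders first.
    kept.sort(key=lambda r: r["amount"], reverse=True)
    # Enrich: add a region column from the lookup table.
    enriched = [{**r, "region": lookup.get(r["customer"], "Unknown")} for r in kept]
    # Aggregate: total order value per region.
    totals = defaultdict(float)
    for r in enriched:
        totals[r["region"]] += r["amount"]
    return enriched, dict(totals)


enriched_rows, totals_by_region = run_pipeline(orders, region_lookup, min_amount=20.0)
print(enriched_rows)
print(totals_by_region)
```

Each commented step corresponds loosely to one stage of a job: rows flow from one stage to the next, and the output of the last stage is what you would write to a file or table.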
If you think all this sounds complicated, you are in for a surprise: it is not. If I can do it, most people can. You can read about what I did in a weekend with the Open Studio here. One caveat: the book's file paths are written for Windows computers (e.g. C:\My Documents), so if you are installing Talend on Mac OS X, Ubuntu or another Unix-like operating system you will need to adapt them, but I'm sure you can work that out.
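If you are following along on a Mac or Linux machine, adapting the paths is mechanical. As a rough illustration only (plain Python, not part of Talend or the book), here is one way to re-root a Windows-style path under a folder of your choosing. The helper name `to_local_path`, the example file path and the `/home/me` base directory are assumptions made up for this sketch.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_local_path(windows_path, base_dir):
    """Re-root a Windows-style path under base_dir, dropping the drive letter."""
    win = PureWindowsPath(windows_path)
    # For an absolute path the first element of .parts is the anchor
    # (for example 'C:\\'); skip it so only the folder names remain.
    parts = win.parts[1:] if win.anchor else win.parts
    return PurePosixPath(base_dir).joinpath(*parts)


# Hypothetical example path; substitute whichever path the exercise uses.
local = to_local_path(r"C:\My Documents\data\input.csv", "/home/me")
print(local)
```

The pure path classes only manipulate strings, so this runs on any operating system and never touches the disk.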
The layout of the book is very straightforward, with plenty of pictures showing how your work should look. The language is simple and free of jargon. Explanations strike just the right balance between detail and simplicity. It is obvious that a lot of care and consideration has gone into making each chapter informative yet succinct.
Maybe you wish to deploy and develop Talend for data integration at work. Perhaps you want to be a freelance DI developer. You may just want to run a little computer project at home. Whatever your requirements, "Getting Started with Talend Open Studio for Data Integration" by Jonathan Bowen is a valuable reference that you will use again and again.