Solving some interesting problems in the project I'm currently working on I started to analyse following problem: you have a large data set (eg. a file) on which you need to apply some expensive transformation (eg. compression). The key requirement is a performance.
The best performance demands can usually be effectively fulfilled using multi-threaded algorithms, especially on multiple CPU cores hardware. But you may not to have a multi-threaded transformation algorithm, or you can get a requirement to have many algorithms pluggable, not always designed for parallel work.
I started to think that it's easy to divide source data for chunks and start a thread pool to do the job. This is feasible when you can have multiple output streams, but if you want to write everything to a single file may be a little problematic. If you synchronize all threads to write to a single stream, you rather end up with multi-threaded algorithm working as well as simple single-threaded one. You will simply have no benefits from concurrency, when all threads need to wait for currently writing thread, having a lock. They simply don't do the transformation in the time they wait for the lock. In the result - only one thread works effectively.
If you want to have benefits from concurrency you need to design it in the way, that all threads can do their expensive work independently from the others, without waiting and synchronization.
The conception is following. We have N working threads, and each one is working its way and has been started in different time, in proportion to the others. If we had N buffers with previously established size for each thread, we could assert concurrent writing to these buffers from independent threads without any collision. If the buffer is overfilled, we can flush it to the output stream. If the lock is gained only for flushing, we potentially have other threads working in this time on their buffers, without waiting to have resource available. A Java implementation of this issue, conformed to input/output streams contract in Java is simple (I use apache commons to shorten time required to perform obvious things):
The static main() method shows an exemplary usage of the stream from multiple threads. You can manipulate constants to check how does it perform.
If we have such multi-threaded stream and the transformation algorithm performance depends on the input data, the buffers are filled in very random way and we have real benefits from multi-threading. I tested this and it performs very well.
But the output file made this way consists of random blocks of interleaved stream data, written to a single file. So how to handle with restoring data from source file (to apply reverse transformation)? This is why we put into the target stream the thread index and block size value before each block. We can now restore this sequentially, reading each thread source stream separately, or we can do this concurrently, using a thread pool with same threads count as in writing mode. All information can be easily retrieved from interleaved stream:
Also this time main() method should get the file from output stream test, read it in thread pool, and perform simple checksum check.
Depending on requirements, ThreadedInputStream class can be used as a source for multi-threaded or sequential algorithm as well. You can start N threads to get N-stream to fetch data from it, or you can freely iterate through stream data entries in a single thread (in this case you need to perform N iteration).
I'm presenting this solution as a curiosity, but also as a real, working algorithm I used in the project. It works well for solving problems, when you want to parallelize sequential transformation algorithm (in my case it was compression) on a potentially large input data. Of course there's another question how to chunk source data, but this will depend on the source data itself. For unstructured file you can for example divide it to equal parts, and start N reading streams with different starting read index.