A few days ago, Intel announced a new version of their C++ and Fortran compilers. According to the announcement:
Application performance is also accelerated by multi-core processors through the use of multiple threads.
So... as far as I understand, and as some other news sites mention, this means that the compiler tries to automatically parallelize a program by creating multiple threads; which code runs on each thread is decided at build time by an algorithm that deduces which blocks of code can be executed at the same time.

If this is true (I mean, if I understood it correctly), it sounds great, but I don't know to what extent the compiler can extract useful information from code blocks in either C++ or Fortran. These two languages follow the imperative programming paradigm: a program written in them describes, step by step, how the machine must operate. In other words, the program specifies how a specific type of machine (a load/store one) must behave in order to compute a result, not what that result is.

Extracting parallelization information from this kind of language seems hard, if not impossible, except in very simple cases. What's more, most imperative languages are not protected against side effects: there is a global state that is accessible from any part of the program, which means that you cannot predict how a specific call will change that state. In terms of functions: a function called with a specific set of parameters can return different values on each call, because it can store auxiliary information in global variables.
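
To make the problem concrete, here is a minimal sketch (using Haskell's IO monad to stand in for an imperative language's global state; the function next is made up for illustration) of a call whose result depends on hidden mutable state:

import Data.IORef

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  -- 'next' reads and updates hidden mutable state, so the same call
  -- can return a different value every time; nothing in its type or
  -- arguments tells the compiler that.
  let next n = modifyIORef counter (+ n) >> readIORef counter
  print =<< next 1  -- prints 1
  print =<< next 1  -- the exact same call now prints 2

A compiler cannot safely reorder, cache, or run such calls in parallel without proving that they don't touch the same state, which is exactly the hard part.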

It seems to me that functional languages are much better suited to this kind of compiler-level parallelization than imperative ones. In a functional language, the program describes how to compute a result at an abstract level, not how a specific type of machine must reach it. The way the compiler arrives at that result is generally irrelevant. (If you know SQL, it has the same property: you describe what you want to know through a query, but you don't know how the database engine will handle it.) Furthermore, and this is important, purely functional languages such as Haskell do not have side effects as long as you stay away from constructs like the IO monad. So what does this mean? A function, when called with a specific set of parameters, will always return the same result. (So yes, the compiler could, and possibly does, trivially apply memoization.)
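
As an aside, GHC does not actually memoize function calls on its own, but purity is what makes caching safe to add by hand. A classic sketch, backing fib with a lazily evaluated list that caches every result:

-- Because fib always returns the same result for the same argument,
-- caching its results cannot change the program's meaning.
fibs :: [Integer]
fibs = map fib [0 ..]

fib :: Int -> Integer
fib 0 = 0
fib 1 = 1
fib n = fibs !! (n - 1) + fibs !! (n - 2)

main :: IO ()
main = print (fib 50)  -- fast: earlier results are looked up, not recomputed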

With these characteristics, a compiler for a functional language could do much more to implement compiler-level parallelization. Each function could be analyzed to see which other functions it calls, generating a call graph; the compiler could then decide which subset of this graph to send to each computing core (i.e. place on an independent thread) and merge the results between threads once it reached the top-level function it split. So if you had an expression such as:

foo = (bar 3 4) + (baz 5 6)

The compiler could prepare two threads, one to compute the result of bar 3 4 and one to compute baz 5 6. At the end, after the two threads finished, it could do the sum. Of course, bar and baz would have to be "large" enough to compensate for the time spent creating and managing the threads.
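
In fact, GHC already lets you express exactly this split by hand, with the par and pseq combinators from Control.Parallel. A sketch, with made-up bodies for bar and baz standing in for expensive computations:

import Control.Parallel (par, pseq)

-- Placeholder definitions; imagine these are costly to evaluate.
bar :: Int -> Int -> Int
bar x y = sum [x .. y * 10000000]

baz :: Int -> Int -> Int
baz x y = sum [y .. x * 10000000]

-- 'par' sparks b for evaluation on another core, 'pseq' forces c on
-- the current one, and the sum then combines the two finished results.
foo :: Int
foo = b `par` (c `pseq` (b + c))
  where
    b = bar 3 4
    c = baz 5 6

main :: IO ()
main = print foo

Compile with ghc -threaded and run with +RTS -N2 to actually use two cores; if bar and baz are too cheap, the spark overhead dominates, which is exactly the trade-off mentioned above.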

Anyway, what I wanted to emphasize is that, depending on the language you choose, specific kinds of code analysis and optimization can be much easier to perform and, of course, much more effective.

To conclude, and since I'm talking about Haskell, I'd like to suggest you read the article "An introduction to Haskell, part 1", recently published at ONLamp.com. It ends by touching on this very idea.