StefanKarpinski

http://www.reddit.com/user/StefanKarpinski

Highest Rated Comments

StefanKarpinski | 153 karma | 2020-07-27 16:58:14 UTC

Quite well, I think. You can write very simple Python-like code and it will, if you mind a few performance considerations, run as fast as C code—and sometimes faster. I think the comparison falters in two areas:

1. Compiler latency. Since Python is interpreted, it doesn't have any compilation lags. Of course, as a result it's slow, but sometimes you don't want to wait for a compiler. It's an ongoing challenge to reduce compiler latency. A huge amount of progress has been made for the 1.5 release, reducing compilation latencies by 2-3x in many common cases. However, we're starting to change tack and focus on being able to statically compile and save more code, which will help even more since most code doesn't need to change from run to run.
2. Perception of complexity. One of the great things about Julia is that it has this whole sophisticated, well-considered type system for when you need it. But people show up and want to learn Julia and are a little intimidated, thinking they need to learn all about types in order to write Julia code. I've tried to convey to people that they really don't, but that's a hard message to get through while also encouraging the people who want to learn that stuff. There's a perfectly useful, high-performance dialect of Julia that many people can use without ever needing to write any type annotations.

Somewhat ironically on the last point, Python has been adding type annotations in recent versions, so there's some convergence here. However, Python's type annotations are really just comments with a special format that can be type checked; they add no expressiveness to the language. In Julia, on the other hand, using types and dispatch to express things is really, really useful. My casual impression is that the coherence of the mypy type annotations is a bit iffy, especially when it gets into parametric types. Julia's type system, on the other hand, is very carefully thought through.
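
A minimal sketch of both points, assuming nothing beyond Base Julia. The names `mean_square` and `describe` are hypothetical and chosen only for illustration. The first function uses no annotations at all; the second set of definitions uses types to select behavior.

```julia
# No type annotations: Julia compiles a specialized version of this
# function for each concrete argument type it is called with.
function mean_square(xs)
    total = zero(eltype(xs))
    for x in xs
        total += x^2
    end
    return total / length(xs)
end

mean_square([1, 2, 3])     # called with a Vector{Int}
mean_square([1.0, 2.0])    # called with a Vector{Float64}

# Here the annotations are not documentation: they decide which
# method runs, based on the runtime type of the argument.
describe(x::Integer) = "an integer"
describe(x::AbstractFloat) = "a floating-point number"
describe(x) = "something else"   # fallback for every other type
```

Calling `describe(1)`, `describe(1.0)`, and `describe("hi")` selects a different method each time, which a checker-only annotation cannot do.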

StefanKarpinski | 42 karma | 2018-08-15 17:36:33 UTC

and see the assembly code for it!

Of course, this is true in any compiled language, but in Julia it's really easy:

julia> f(x, y) = 2x^2 + 3y - 1
f (generic function with 1 method)

julia> f(4, 5)
46

julia> @code_native f(4, 5)
    imulq   %rdi, %rdi
    leaq    (%rsi,%rsi,2), %rax
    leaq    (%rax,%rdi,2), %rax
    addq    $-1, %rax
    retq
    nopw    %cs:(%rax,%rax)

That easiness makes all the difference: I almost never bother looking at the assembly code in C because it's such a pain to do. I look at Julia assembly code all the time because it's so easy, and I know lots of others do as well. This is amazing both as a learning tool and as a way to guide the entire ecosystem towards great performance.

StefanKarpinski | 36 karma | 2020-07-27 19:30:20 UTC

If you didn't have the compatibility constraint of version 1, what would be the first thing you'd change in the language?

I keep a list of potentially breaking changes we'd like to make in 2.0, but none of them are really that huge, which means that without bigger motivations, we probably shouldn't make them.

For 2.0, rather than thinking about what to break, I think we need to think about solving really big problems without worrying about compatibility—much like you get to when you're designing a new language from scratch. But then we take those ideas and see how we can connect them back to the reality of Julia 1.x, minimizing the breakage and figuring out a transition path. For some of these things, we might realize that we can do them without breaking anything, but you need to give yourself room to think grand, potentially disruptive thoughts in order to get there.

We've talked a lot about embracing immutability more. It's just so good for the compiler to know that something can't change and can be safely shared and/or copied. Most of the time mutation is only used to construct the initial structure of arrays and then they are transformed from one value to another by non-mutating mathematical functions. Compilers are actually really good at figuring out that they can do that kind of thing in place, ironically, if they are guaranteed that the values are immutable and changes cannot be observed externally. So I'd love to see the introduction of immutable arrays and more core APIs returning immutable arrays. But along with that, you need the ability to "unfreeze" and mutate those arrays in place (i.e. when the compiler can prove that no one is looking).

We probably also want to be more aggressive about parallelism and make more things in the language implicitly multithreaded. That's often potentially going to break someone's code, so you can't just do it, but if we do that in a 2.0 release, then it's fair game.

What is the greatest potential (by changing the language) to gain the next significant step in performance, in your opinion? What are you focused on solving to achieve that?

Multithreading. It's already there, but it takes a long time to shake out something as hard to get right as a completely general, composable threading system. Julia 1.3 was a huge milestone here and 1.4 fixed and improved a lot of things. 1.5 stabilizes most of the threading API and 1.6 will stabilize the rest. We're starting to see more and more good uses of threading in packages, but we need to also add more threading to the stdlibs so that people start getting speedups "for free". That's a bit tricky to do, though, since if it's not done very carefully, it can break people's code. (See my last point above for 2.0.)
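
As a rough illustration of the task-based threading described above, here is a minimal sketch using `Threads.@spawn` (available since Julia 1.3). The function name `chunked_sum` and its `nchunks` keyword are hypothetical, and the sketch assumes a non-empty input.

```julia
using Base.Threads: @spawn, nthreads

# Sum a non-empty vector by splitting it into chunks and summing each
# chunk in its own task; the scheduler maps tasks onto available threads.
function chunked_sum(xs; nchunks = nthreads())
    chunk_len = max(1, cld(length(xs), nchunks))
    chunks = Iterators.partition(xs, chunk_len)
    tasks = [@spawn sum(chunk) for chunk in chunks]
    # fetch waits for each task and returns its result
    return mapreduce(fetch, +, tasks)
end

chunked_sum(collect(1:100))
```

The code runs correctly with a single thread as well. Starting Julia with more threads (for example via the `JULIA_NUM_THREADS` environment variable) lets the spawned tasks run in parallel.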

Do you see in the future such capability from Julia? Generating static / dynamic libraries without the pre compilation overhead to be integrated into production?

Yes, this is actively being worked on. It will be possible to generate statically compiled shared libraries from Julia code in the future.

StefanKarpinski | 33 karma | 2018-08-15 17:01:22 UTC

The main thing that makes Julia special is a deep commitment to multiple dispatch and performance. Multiple dispatch is a feature not many languages have and even the ones that do tend to relegate it to being a kind of fancy feature that you use on rare occasions but not in regular programming and it usually has fairly massive overhead. In Julia _everything_ uses multiple dispatch. Operations as basic as integer and floating-point addition and array indexing are defined in terms of it. The most ubiquitous, performance-sensitive operations go through the same machinery as user-defined types and operations. That means that you can define your own types and use multiple dispatch as much as you want without being afraid of the overhead. This is transformative in terms of how people use the language, and as a result, how the ecosystem develops. What we've seen as a result of this is that people define lots of very low-overhead types and use multiple dispatch all over the place. The effect is that different parts of the ecosystem compose really well and you get a multiplicative effect where the amount you can do is a product of all the components available to you instead of just the sum.
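
A small sketch of what dispatching on more than one argument looks like, assuming only Base Julia. The `Shape`, `Circle`, `Square`, and `interaction` names are hypothetical and exist purely for illustration.

```julia
abstract type Shape end

struct Circle <: Shape
    radius::Float64
end

struct Square <: Shape
    side::Float64
end

# The method is selected using the types of *both* arguments.
interaction(a::Shape, b::Shape) = "generic shape pair"   # fallback
interaction(a::Circle, b::Circle) = "two circles"
interaction(a::Circle, b::Square) = "circle then square"

interaction(Circle(1.0), Circle(2.0))   # picks the Circle/Circle method
interaction(Square(1.0), Circle(2.0))   # no specific method, uses the fallback

# Built-in operators go through the same machinery: `+` is an ordinary
# generic function with many methods.
length(methods(+)) > 1
```

User-defined methods like `interaction` and built-in ones like `+` are looked up the same way, which is why new types can be added without treating dispatch as a special, costly feature.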

A really great example of this in action is the talk at JuliaCon this year by Robin Deits from one of MIT's robotics labs where all the different unrelated packages by different authors just composed to create this fully functioning robotics system. And then he just threw in as an aside that you can start with special measurement error values using the Measurements package and it just passes through all the layers, including rigid body dynamics calculations and differential equations solvers and in the end you get estimates of how accurate your computations are based on initial measurement error. And the kicker: the whole system is faster than real-time to the point where they have to insert sleep calls into the code to slow it down to sync up with real robotics systems.
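
To give a feel for how an error-carrying value can pass through code that knows nothing about it, here is a toy sketch. It is not the Measurements package or its API: the `Uncertain` type and the `weighted` function are hypothetical, and the error propagation is simplified (first order, independent errors only).

```julia
# A value with an associated standard error (toy type, illustration only).
struct Uncertain
    val::Float64
    err::Float64
end

# Teach the basic operators about the new type.
# Sum of independent quantities: errors add in quadrature.
Base.:+(a::Uncertain, b::Uncertain) = Uncertain(a.val + b.val, hypot(a.err, b.err))
# Scaling by an exact constant scales the error by its magnitude.
Base.:*(c::Real, a::Uncertain) = Uncertain(c * a.val, abs(c) * a.err)

# Generic code written with no knowledge of Uncertain.
weighted(x, y) = 3x + 4y

weighted(1.0, 2.0)                                     # plain floats
u = weighted(Uncertain(1.0, 1.0), Uncertain(2.0, 1.0)) # same code, errors carried along
(u.val, u.err)
```

The same `weighted` definition serves both calls; only the methods of `+` and `*` differ. A real package handles far more, such as correlations between values and the full set of mathematical functions.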

StefanKarpinski | 28 karma | 2018-08-15 18:01:58 UTC

A few years ago I don't think anyone would have believed that Julia's package ecosystem would be where it is today. There are a number of areas where it's hands down better than anything else: optimization, differential equations, linear algebra, numerical analysis (too many packages to even link a single one), among others. And the same momentum that has allowed us to catch up to and surpass other ecosystems in such a short time continues.

Julia already has OpenMP-style for loop multithreading, but we're nearly ready to merge a pull request that implements M:N mapping of tasks onto hardware threads, which will make Julia's threading system similar to Go's but tuned for extreme computational performance rather than for writing concurrent servers (although you can do that as well). As we see the number of cores on CPUs go up and the amount of memory per core go down, this will increasingly be a huge competitive advantage over other dynamic languages that don't have any plausible answer to multithreading.
