An excellent programming language for data analysis
July 20, 2014 5:48 PM

"Julia is a high-level, high-performance dynamic programming language for technical computing." The language is elegant (homoiconic, multiple-dispatch, consistent and extensible type system), but with easy-to-learn syntax. The standard library includes a wide array of fast and useful functions, and the number of useful packages is growing.

The creators of the language explain why they designed it.

It's very easy to install -- just head to http://julialang.org/downloads/. External packages can be installed by calling Pkg.add(package_name); this will handle dependencies both inside and outside of Julia. I was pleasantly surprised by how simple the process was, especially for a young language.

Particularly excellent is its interface to IPython, which provides a surprisingly polished Mathematica-style notebook interface.
posted by vogon_poet (33 comments total) 40 users marked this as a favorite
 
> and the number of useful packages is growing.

I'm intrigued by Julia, since it seems to be attempting to combine the best bits of Python and R. But unfortunately that package list is still pretty pathetic compared to what's available for R - I can't find a single package for dealing with spatial raster/vector data yet, which makes it useless for my purposes. But I'll be watching, with interest.

> Particularly excellent is its interface to IPython

I would be much happier if they'd followed the example of RStudio rather than interfacing with IPython, but that's just a personal thing, I guess.
posted by Jimbob at 5:58 PM on July 20, 2014


Graydon Hoare (the original creator of Rust) is excited about Julia.

More on Julia's Lisp heritage.
posted by mbrubeck at 7:23 PM on July 20, 2014


"Julia uses the type system in all the ways that don't end with the programmer arguing with the compiler." attributed to Leah Hanson
posted by d. z. wang at 7:44 PM on July 20, 2014


Well, all I know is if I programmed in Julia, half of what I say would be meaningless.
posted by symbioid at 7:49 PM on July 20, 2014 [1 favorite]


also - eponysterical FPP?

I wish I had time to really truly grok everything about programming languages. The older I get the dumber I feel, but the more I want to learn and understand and the less time I feel I have.

I look forward to diving in and seeing the hype - though I imagine for me, it'll be one more abstract language to read about. I'm slow enough and lazy enough that c# and unity is all I actually work with and that's not often enough as it is, let alone learning other languages, let alone new exciting ones with plenty of new things to learn about.
posted by symbioid at 8:21 PM on July 20, 2014


Julia got a really terrible piece of publicity in February from Wired. It was so infuriatingly stupid that I have still refused to look at the language, even though I'm exactly in the target audience (former R user, currently use Python/IPython daily for data analysis).

I expect that I'm not the only one. And I'm not that sure that there isn't some kernel of truth to the idea that Julia is doing nothing new. The PR they put out is certainly overwhelming in tone, which doesn't seem like a great sign.
posted by TypographicalError at 8:38 PM on July 20, 2014 [2 favorites]


I just went back and reread the Wired Julia article more closely:
> It so happened that Bezanson had recently made a study of language design, and had come to the conclusion that the tradeoffs inherent in most languages were avoidable. “It became clear that a lot of it had been designed haphazardly,” Bezanson says. “If you started from the beginning, you could recreate the things that people liked about those languages without so many of the problems.”
I had forgotten about that tidbit. Yeah, I'm never using a programming language where one of the creators seems to know so little about how the process of creating them works.
posted by TypographicalError at 8:41 PM on July 20, 2014 [2 favorites]


I like Java for its performance and relatively clean language design. I hate it for its excessive verbosity and anything to do with file IO.
I like Python for its readability and list comprehensions. I wish it ran faster and had Ruby-style blocks.
I really like Ruby, but it's dog-slow and has no formal language definition, other than 10,000 lines of MRI code.
I admire C++ for its speed. Other than that I hate everything about it.
I like Haskell's typing and functional emphasis. I wish I could understand its file IO without needing to grok monads.
I wish I could code more in Common Lisp, but I find it hard to read prefix notation and (as with Haskell) mutable state can sometimes be really handy.
I tried Go and it felt to me like a slightly less verbose Java. It is not blazingly fast (yet) and doesn't support map or fold/reduce style functions, but I would be OK if it became the next widely-adopted systems language as long as they can make it run faster.
I have no comment on JavaScript other than that I feel lucky to have avoided coding in it.
I'm curious to try out Clojure and Rust.

I wonder what positives and negatives I would find in Julia.
posted by A dead Quaker at 9:21 PM on July 20, 2014 [7 favorites]


Not that it's a perfect language (and apparently its creators have delusions of grandeur, or at least Wired made them look that way), but it hits most of your positives and avoids most of your negatives. Particularly: faster than the competition, readable, functional, good type system (dynamically typed with optional type specifications), you don't need to understand category theory to do I/O. There are Python-style list comprehensions.

Probably the downside would be that it's not really a general-purpose language right now: it's set up basically as an R-like environment for scientific computing.
posted by vogon_poet at 9:30 PM on July 20, 2014


> it's set up basically as an R-like environment for scientific computing.

Except less usable and lacking much of the functionality of R, especially when it comes to the core work of, for example, statistics and plotting charts. Which isn't dissing on the language as a whole, just at the moment there is very little to entice R users to switch to it - least of all the years of refining and reviewing the statistical functions that have gone into R.
posted by Jimbob at 9:36 PM on July 20, 2014 [1 favorite]


A dead Quaker: "I tried Go and it felt to me like a slightly less verbose Java. It is not blazingly fast (yet) and doesn't support map or fold/reduce style functions, but I would be OK if it became the next widely-adopted systems language as long as they can make it run faster."

Derail, and I sympathize with most of your list, but Go isn't a systems language. It's a niche language for Python programmers who want better performance but don't want to leave behind garbage collection or their syntactic comfort zone. It has nice lightweight concurrency primitives, but it's not the only language that does, and IMO there's no place for a language in 2014 with a type system as brittle and archaic as Go's.
posted by invitapriore at 10:01 PM on July 20, 2014


If you want to like Go but don't, give Erlang a try.

Stop laughing.
posted by TheNewWazoo at 10:49 PM on July 20, 2014 [2 favorites]


Wired has a way of making their interview subjects look like they have delusions of grandeur. It's like megalomania by proxy or something. I wouldn't read too much into it.
posted by en forme de poire at 11:00 PM on July 20, 2014 [5 favorites]


If you want to like Erlang but don't, give Elixir a try.
posted by Phssthpok at 1:54 AM on July 21, 2014


A dead Quaker:

On the "Python being faster" front, you might have heard of/be interested in PyPy, a JIT compiler for Python. I admit I want the Ruby block syntax too. Python's block could be so much nicer.

As for Julia, it looks *extremely* interesting. The type system and multiple dispatch make it look like you could do some extremely extensible things with regard to adding custom types to the inbuilt functions, or doing clever things with, in a Web context for instance, render functions that are custom based on returned values from handler functions.
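
A rough illustration of the multiple-dispatch idea above, sketched in Python rather than Julia since Python comes up throughout the thread. Python has no built-in multiple dispatch, so the `register` decorator, the `render` function, and the `Html`/`PlainText` marker classes are all hypothetical names made up for this sketch. Julia's actual method resolution is more careful about ambiguity than this simple lookup.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical registry: (value type, format type) -> implementation.
_methods: Dict[Tuple[type, type], Callable[[Any, Any], str]] = {}


def register(value_type: type, format_type: type):
    """Decorator that files an implementation under a pair of types."""
    def decorator(func):
        _methods[(value_type, format_type)] = func
        return func
    return decorator


def render(value: Any, fmt: Any) -> str:
    """Choose an implementation from the runtime types of BOTH arguments."""
    # Walk each argument's class hierarchy, most specific class first.
    for value_cls in type(value).__mro__:
        for fmt_cls in type(fmt).__mro__:
            impl = _methods.get((value_cls, fmt_cls))
            if impl is not None:
                return impl(value, fmt)
    raise TypeError(
        f"no render method for ({type(value).__name__}, {type(fmt).__name__})"
    )


class Html:
    """Marker type for HTML output (made up for this example)."""


class PlainText:
    """Marker type for plain-text output (made up for this example)."""


@register(str, Html)
def render_str_html(value, fmt):
    return "<p>" + value + "</p>"


@register(list, Html)
def render_list_html(value, fmt):
    items = "".join("<li>" + str(item) + "</li>" for item in value)
    return "<ul>" + items + "</ul>"


@register(object, PlainText)
def render_any_text(value, fmt):
    # Fallback: any value at all can be shown as plain text.
    return str(value)


print(render("hello", Html()))
print(render([1, 2], Html()))
print(render([1, 2], PlainText()))
```

Supporting a new return type or a new output format only takes another `@register` line; `render` itself never changes, which is the extensibility being described.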

Now I want to go write code. :)
posted by aurynn at 2:35 AM on July 21, 2014


> I tried Go and it felt to me like a slightly less verbose Java. It is not blazingly fast (yet) and doesn't support map or fold/reduce style functions, but I would be OK if it became the next widely-adopted systems language as long as they can make it run faster.

Go is not really a systems language. It's a language for writing web servers in, and benchmarks suggest it is very fast in that niche.
posted by a snickering nuthatch at 4:16 AM on July 21, 2014 [1 favorite]


I tried out a lot of the languages mentioned and the only one that really stuck was Python. I'm back to Ada because most of what I write isn't "let me whip something up quick to slice and dice a data set" (which Python does fine), but "this needs to last and not be brittle and full of security holes".
posted by kjs3 at 8:19 AM on July 21, 2014 [2 favorites]


If you have a more contemplative bent, Chapel might be for you.
posted by Monday, stony Monday at 10:49 AM on July 21, 2014


I haven't seen a direct comparison, but Julia appears to be faster than LuaJIT (by looking at their comparisons with C), which I've considered the JIT darling for a while now. Does anyone have any more concrete information on this?
posted by hanoixan at 11:06 AM on July 21, 2014


> Go isn't a systems language. It's a niche language for Python programmers who want better performance but don't want to leave behind garbage collection or their syntactic comfort zone.

> Go is not really a systems language. It's a language for writing web servers in

It's not clear to me that Go is intended either to be a replacement for Python or for web servers specifically. I've heard it's being used internally at Google and I'm curious for what purposes. They had an interesting blog post on concurrency with data pipelines a while back, which suggests to me they are using Go for high-volume data processing, maybe. (Rob Pike previously worked on Sawzall, a language used mainly for processing Google's search logs.) In any case, the blog post is talking about stuff that they wouldn't use Python for at Google. The other two main languages there are Java and C++, and the emphasis is on C++.

Maybe "systems language" wasn't the right term. What I'm curious about is whether Google is going to start rewriting their backend server code in it (phasing out C++) or if it will only occupy its own special niche of some kind. I get that it's good for situations where you're blocking on IO a lot, but I don't see it as a C++ replacement at Google until its single-threaded execution is significantly faster. (The speed tests I've seen so far haven't impressed me, but it's a young language, and I bet they have plans for a lot of optimizations.)

All that said, I'm not thrilled with the syntax.

> Not that it's a perfect language...but it hits most of your positives and avoids most of your negatives.

Yeah, Julia sounds like something I would like, but it seems like there are always some tradeoffs, maybe in unexpected ways. It's definitely high on the list of languages I want to try though.
posted by A dead Quaker at 11:54 AM on July 21, 2014


They don't really call Go a "systems language" anymore. As I understand it, they do use it for a good amount of new stuff, but not necessarily to replace C++ or Java. For instance, some of the support code that's written for BoringSSL (Google's OpenSSL fork) is in Go.

Go does have a big advantage over Python in that it doesn't suffer from the Global Interpreter Lock. It also has good concurrency primitives, and its standard library is starting to really take shape. So my guess for its internal use would be stuff that doesn't need to be blazingly fast, but that may involve concurrency and benefit from more speed than what Python offers.
posted by Monday, stony Monday at 12:06 PM on July 21, 2014


Yeah, I haven't used Julia yet but there are some other people in my research group who are using it for (I believe) processing evolutionary genomics data. In addition to speed, the straightforward parallelization syntax is really attractive for bioinformatics where there are a lot of trivially-parallelizable problems. And even if those benchmarks end up being only sort of reflective of the differences between Julia, Python, MATLAB and R, that's still a game changer. The main thing with any new language is always libraries but those seem to be getting made quickly, which is a good sign.
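
As a rough picture of what "trivially parallelizable" means here, sketched in Python (not Julia): each input is processed independently, so the work can be handed to a pool with a plain map. The `gc_content` function and the sample sequences are invented for illustration; they are not from the research described above. Threads keep the sketch simple, though CPU-bound work in CPython would usually want `ProcessPoolExecutor` instead because of the Global Interpreter Lock.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_content(sequence: str) -> float:
    """Fraction of bases in a DNA string that are G or C."""
    if not sequence:
        return 0.0
    gc = sum(1 for base in sequence.upper() if base in "GC")
    return gc / len(sequence)


# Made-up sample data; each sequence can be scored without the others.
sequences = ["ATGC", "GGCC", "ATAT", "gcgcat"]

with ThreadPoolExecutor(max_workers=4) as pool:
    # map() preserves input order, so results line up with `sequences`.
    results = list(pool.map(gc_content, sequences))

for seq, score in zip(sequences, results):
    print(seq, round(score, 2))
```

Because no task depends on another, the only coordination needed is collecting the results at the end.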
posted by en forme de poire at 12:11 PM on July 21, 2014


> I wish I could code more in Common Lisp, but I find it hard to read prefix notation and (as with Haskell) mutable state can sometimes be really handy.

Common Lisp has mutable state all over. You can rplaca and rplacd the car and cdr of any cons cell, plus it has native arrays and mutable records. And this isn't low-level stuff for the library writers either; it's perfectly idiomatic to accumulate a list using nconc.
posted by d. z. wang at 9:24 PM on July 21, 2014


Yeah, Common Lisp is "multiparadigm." Like Python, some operations may or may not have side effects depending on implementation unless you read the CLHS very carefully.
posted by CBrachyrhynchos at 6:42 AM on July 22, 2014


> Go isn't a systems language. It's a niche language for Python programmers

Gadzooks, no. Go has little to appeal to Python programmers; it's really aimed at other users of compiled languages like C++ or Java - and its creators explicitly conceived of it as a systems language.

> Go is not really a systems language. It's a language for writing web servers in.

So what do you define a "systems language" as exactly? Surely you'd write a web server in a "systems language"?
posted by lupus_yonderboy at 8:34 AM on July 22, 2014 [1 favorite]


Oh, and if you want to compile Python for speed, interface Python with a C or C++ library, or port your Python to C++ (again for speed), might I suggest Cython, which has worked extremely well for me?
posted by lupus_yonderboy at 8:36 AM on July 22, 2014


> So what do you define a "systems language" as exactly?

A systems language can be used to write an operating system that runs on bare metal. As practical consequences of this, it must allow the user to control the layout of structures in memory, and it's extra nice if it allows inline asm.
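
To make "control the layout of structures in memory" concrete, here is a small sketch using Python's standard `struct` module. Python is of course not a systems language; this only illustrates the kind of control (padding and byte order) that the definition above is about. The field choice, one 1-byte tag followed by one 4-byte integer, is an arbitrary example.

```python
import struct

# "@" = native alignment: the compiler-style layout, which may insert
# padding after the 1-byte field so the 4-byte field is aligned.
native_size = struct.calcsize("@BI")

# "<" = little-endian, standard sizes, no padding: 1 + 4 bytes exactly.
packed_size = struct.calcsize("<BI")

print("native layout:", native_size, "bytes")
print("packed layout:", packed_size, "bytes")

# The packed bytes are fully determined by the format string.
raw = struct.pack("<BI", 1, 2)
print(raw.hex())

# The same format string reads them back.
tag, value = struct.unpack("<BI", raw)
print(tag, value)
```

In C, C++, Ada or Rust the equivalent choices are made on the type declaration itself, which is what lets those languages describe hardware registers or on-disk formats directly.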

> Surely you'd write a web server in a "systems language"?

No, not necessarily.
posted by a snickering nuthatch at 8:47 AM on July 22, 2014


I grew up on C++, but now all I find myself using is Perl. I had a job where I programmed in Common Lisp, and I thought Julia was exciting as possibly something with a powerful macro system. Geez, when can we have Perl 6 already?
posted by I-Write-Essays at 9:08 AM on July 22, 2014


> A systems language can be used to write an operating system that runs on bare metal.

Good answer. I guess I'm not sure if Go really fits this or not then...
posted by lupus_yonderboy at 12:57 PM on July 22, 2014


> A systems language can be used to write an operating system that runs on bare metal.

How many "systems languages" do we really need, then? It seems the complaint that "Oh X isn't a systems language" is pretty pointless, since very, very few people are going to sit down and write a new OS.
posted by Jimbob at 1:53 PM on July 22, 2014


I think the majority of languages these days have a version that can compile into assembly.
posted by CBrachyrhynchos at 5:20 PM on July 22, 2014


lupus_yonderboy: "Gadzooks, no. Go has little to appeal to Python programmers, it's really aimed at other users of compiled languages like C++ or Java - and its creators explicitly conceived of it as a systems language."

In intent, maybe, but practically speaking a lot of its growth has been driven by people who wrote web apps in Python and then hit a performance wall. I tend to think that if Python is too slow for your computationally-unintensive web app then it's more an indictment of your architecture than the interpreter, but that's still my impression of why Go is popular.
posted by invitapriore at 5:55 PM on July 22, 2014


> It seems the complaint that "Oh X isn't a systems language" is pretty pointless,

I wasn't complaining that Go isn't a systems language; I was just pointing out a fact w.r.t. the definitions I think are common. Go really is fast for what it is intended for, and that's great. The only complaint I would have about it would be its silly type system.

> since very, very few people are going to sit down and write a new OS.

Lots of people are interested in running on bare metal (especially in the context of embedded devices), even if they're not interested in developing complete operating systems.

There are so many tradeoffs (and so many possible differing preferences w.r.t. those tradeoffs) that it makes sense that there are (and should be) a variety of systems languages. C, C++, Ada, ATS, and Rust (and I'm sure many more) all feel different; they ask different things of a programmer and they give different things back. It's all good.
posted by a snickering nuthatch at 4:04 AM on July 23, 2014




This thread has been archived and is closed to new comments