Date: 2014-05-20

For years, specialized tools like R and SPSS have been standard for anyone working in statistics and analytics, whether in private industry or the academic sector.

But for many in the business world, those tools are unfamiliar, which means companies haven’t been able to take advantage of them the way that, say, universities have. That could change thanks to an open-source Python data-analysis library called Pandas, which offers many of the same analytics tools as R in a language that many business developers already use.

“One of the reasons we like to use Pandas is because we like to stay in the Python ecosystem,” says Burc Arpat, a quantitative engineering manager at Facebook. “We have a lot of systems inside Facebook, or infrastructure that allows us to either use Python to talk to those systems or integrates with Python very easily or is written in Python.”

Many of the engineers working on analytics projects at Facebook are well acquainted with R, which the company also uses for certain tasks, but Facebook’s existing Python codebase often makes Pandas easier to work with. That’s a common reason for developers to choose Pandas, says the library’s creator, Wes McKinney.

“In companies that have an engineering culture, just because of the general growth of Python, it’s made Python and Pandas an easy choice,” he says.

McKinney began developing the library while at the financial firm AQR Capital Management, where, he says, he was using R for quantitative finance projects and basic “data wrangling.”

“I was frustrated on multiple fronts,” he says. “I felt that R was not strong enough for software engineering. For building big software, R left a lot to be desired as far as the tooling for debugging and building big systems.”