David Smith

by Nick Elprin, Co-Founder of Domino Data Lab
We built a platform that lets analysts deploy R code to an HTTP server with one click, and we describe it in detail below. If you have ever wanted to invoke your R model with a simple HTTP call, without dealing with any infrastructure setup or asking for help from developers (imagine Heroku for your R code), we hope you’ll enjoy this.
Introduction
Across industries, analytical models are powering core business processes and applications as more companies realize that analytics are key to their competitiveness. R is particularly well suited to developing and expressing such models, but unfortunately, the final step of integrating R code into existing software systems remains difficult. This post describes our solution to this problem: “one-click” publishing of R code to an API server, allowing easy integration be...

R tops KDNuggets data analysis software poll for 4th consecutive year

KDNuggets asked its readers the question "What programming/statistics languages you used for an analytics / data mining / data science work in 2014?" and once again, R was the #1 response. (R was also the #1 response in similar polls in 2013, 2012 and 2011.) The top 5 selections of the 719 respondents were: R (352 respondents), SAS (262), Python (252), SQL (220), and Java (89). Respondents were able to select multiple languages, which is why the totals add up to more than the number of respondents. In fact, the analysis of data science software used together is quite interesting. Looking ...

Statistics: Losing Ground to CS, Losing Image Among Students

by Norman Matloff
The American Statistical Association (ASA) leadership, and many in Statistics academia, have been undergoing a period of angst the last few years. They worry that the field of Statistics is headed for a future of reduced national influence and importance, with the feeling that the field is to a large extent being usurped by other disciplines, notably Computer Science (CS), and that efforts to make the field attractive to students have largely been unsuccessful. I had been aware of these issues for quite a while, and thus was pleasantly surprised last year to see then-...

Entering the field as a data scientist with certification

By Neera Talbert, VP Services and Ben Wiley, R Programmer at Revolution Analytics
By now, everyone should be familiar with the data scientist boom. Simply logging onto LinkedIn reveals a seemingly infinite number of people with words and phrases like “Data Scientist”, “Big Data Specialist”, and “Analytics” in their title. A few weeks ago, an article floated around the internet about how R programmers are the highest paid software engineers in industry. But the career of a data scientist is hot not only because it’s highly lucrative; drawing conclusions from data is itself a rew...

Revisiting package dependencies

by Andrie de Vries
In my previous post I wrote about how to identify and visualize package dependencies. Within hours, Duncan Murdoch (member of R-core) identified some discrepancies between my list of dependencies and the visualisation. Since then, I fixed the discrepancies. In this blog post I attempt to clarify the issues involved in listing package dependencies. In miniCRAN I expose two functions that provide information about dependencies: The function pkgDep() returns a character vector with the names of dependencies. Internally, pkgDep() is a wrapper around tools::pac...