Today we’re happy to announce Ploomber’s integration with RStudio. You can now use Ploomber to develop modular and production-ready R pipelines interactively.
Adding support for R Markdown
RStudio’s most popular file format is R Markdown (.Rmd), which allows users to mix R code and Markdown in a text file:
Some text
```{r}
# some code
df = read.csv(upstream$clean$data)
head(df)
```
R Markdown and RStudio are highly popular in the data science community, and we want to bring Ploomber to them. All the features available to Python users work for R: incremental builds, parallelization, even running in the cloud!
Furthermore, some people feel more comfortable with R and others with Python; with Ploomber, you can use both in the same pipeline. For example, say you’re working with some data and need to apply a statistical method that has only been implemented in R; you can easily integrate a Python script with an R Markdown file:
# load data with Python
- source: tasks/load.py
  product:
    nb: out/load.html
    # output data
    data: out/raw.parquet

# process data with R (use out/raw.parquet as input)
- source: tasks/some-statistical-method.Rmd
  product:
    nb: out/report.html
    data: out/results.parquet
However, bear in mind that this makes the project setup more complex, so if possible, consider using a single language.
Note that the .Rmd file in the example above generates an output report, out/report.html. Ploomber converts .Rmd files to Jupyter notebooks (.ipynb) at runtime and then executes them, so every pipeline execution generates an output report. You can change the product’s extension to .html (as we did in the example) if you want Ploomber to convert the output report. You can also use plain .R scripts if you prefer; they work the same way. Any other editor that supports .R or .Rmd files works as well.
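To make the conversion step more concrete, here is a minimal Python sketch of the general idea: splitting R Markdown text into notebook-style Markdown and code cells. It is a toy illustration, not Ploomber’s actual converter; `rmd_to_notebook` is a hypothetical helper, chunk options are ignored, and the `ir` kernel name is assumed to be the one registered by the R kernel.

```python
import json
import re

# Built programmatically so this listing contains no literal code fence
FENCE = "`" * 3

# Matches an R chunk: an opening fence with {r ...}, the body, a closing fence
CHUNK = re.compile(
    r"^" + FENCE + r"\{r[^}]*\}\n(.*?)^" + FENCE + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def _markdown_cell(source):
    return {"cell_type": "markdown", "metadata": {}, "source": source}


def _code_cell(source):
    return {
        "cell_type": "code",
        "metadata": {},
        "execution_count": None,
        "outputs": [],
        "source": source,
    }


def rmd_to_notebook(text):
    """Split R Markdown text into a notebook-like dictionary (toy version)."""
    cells = []
    position = 0
    for match in CHUNK.finditer(text):
        # Anything between the previous chunk and this one is prose
        prose = text[position:match.start()].strip()
        if prose:
            cells.append(_markdown_cell(prose))
        cells.append(_code_cell(match.group(1).rstrip("\n")))
        position = match.end()
    trailing = text[position:].strip()
    if trailing:
        cells.append(_markdown_cell(trailing))
    return {
        "cells": cells,
        # Assumption: "ir" is the name of the installed R kernel
        "metadata": {"kernelspec": {"name": "ir", "language": "R", "display_name": "R"}},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


# The R Markdown example from the beginning of this post
rmd_text = "\n".join([
    "Some text",
    FENCE + "{r}",
    "# some code",
    "df = read.csv(upstream$clean$data)",
    "head(df)",
    FENCE,
])

notebook = rmd_to_notebook(rmd_text)
print([cell["cell_type"] for cell in notebook["cells"]])
print(json.dumps(notebook["cells"][1]["source"]))
```

The sketch produces one Markdown cell for the prose and one code cell for the R chunk, which is the structure a notebook executor needs to run the file and render a report.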
Setting up
To execute pipelines, Ploomber needs a working R installation with the IRkernel package installed and configured. Check out our documentation to learn more.
Try it out!
Try out an example pipeline by running the following commands in a terminal:
# install ploomber
pip install ploomber --upgrade
# get R example
ploomber examples -n templates/spec-api-r -o example
# move to the example's directory
cd example
# execute
ploomber build
# inject input paths to each task
ploomber nb --inject
Now open plot.Rmd and start running things interactively. To rerun your pipeline, go back to the terminal and run:
ploomber build
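The injection step gives each task the paths it reads from and writes to, which is what an expression such as `upstream$clean$data` looks up in R. The Python sketch below is a simplified model of that idea using the two-task Python and R pipeline shown earlier. It is not Ploomber’s code: `injected_parameters` is a hypothetical helper, the task names are assumed to match the file names, and the `upstream` lists are written by hand for illustration.

```python
# Simplified stand-in for the two-task pipeline shown earlier.
# Assumptions: task names mirror the file names, and the "upstream"
# lists are declared by hand for this sketch.
TASKS = {
    "load": {
        "source": "tasks/load.py",
        "product": {"nb": "out/load.html", "data": "out/raw.parquet"},
        "upstream": [],
    },
    "some-statistical-method": {
        "source": "tasks/some-statistical-method.Rmd",
        "product": {"nb": "out/report.html", "data": "out/results.parquet"},
        "upstream": ["load"],
    },
}


def injected_parameters(task_name, tasks=TASKS):
    """Return the upstream/product mapping a task would see (hypothetical helper)."""
    task = tasks[task_name]
    # Each upstream task name maps to that task's products
    upstream = {name: dict(tasks[name]["product"]) for name in task["upstream"]}
    return {"upstream": upstream, "product": dict(task["product"])}


params = injected_parameters("some-statistical-method")

# The R equivalent of this lookup would be upstream$load$data
print(params["upstream"]["load"]["data"])  # out/raw.parquet
print(params["product"]["nb"])  # out/report.html
```

Because every task receives its paths this way, the R task never hard-codes where the Python task saved its output; changing a product path in the pipeline file is enough.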
Running in the cloud
Do you need more computing power? We’ve got you covered: you can export your pipelines and execute them in Kubernetes, Airflow, AWS Batch, and SLURM without code changes. Check out our documentation for details.
Closing remarks
Many of the teams using Ploomber use both R and Python to develop their pipelines, and they asked us for recommendations to enhance collaboration. Let us help you ship data products faster. Join our community, and we’ll be happy to answer all your questions.