Python and R are two of the most widely used programming languages for analyzing scientific data. Python is a general-purpose language, used for everything from building websites to developing artificial intelligence, which makes it a good fit when one language has to cover many tasks. R, on the other hand, specializes in statistics and data visualization, which makes it popular with statisticians and data scientists. Both languages have rich libraries and active communities. Used together, they cover a wide range of data exploration tasks.
Using R and Python together is relatively easy, especially when the R scripts already exist. Running one from Python takes a single line of code.
Let’s say we’ve saved the following R code in a file named script.r.
cat("Hello from R")
We can call the R code from Python as follows:
import subprocess

res = subprocess.call("Rscript script.r", shell=True)
res
Line 1: We start by importing the subprocess module. This module provides a way to create new processes, connect to their input/output/error pipes, and obtain their return codes.
Line 3: We create a variable named res to capture the result of the command we’re about to execute.
The subprocess.call() function runs the command "Rscript script.r" and waits for it to finish. The shell=True parameter indicates that the command should be executed through the system shell.
The function returns the command’s exit code (0 on success), which we store in the res variable. The text printed by the R script goes straight to the console and isn’t captured in res.
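If the text printed by the R script is also needed, one option is subprocess.run() with capture_output=True. The sketch below is a minimal illustration and not part of the original example: the helper name run_and_capture is hypothetical, the command is passed as a list of arguments instead of through the shell, and a Python stand-in command is used when Rscript isn’t installed so that the snippet still runs.

```python
import shutil
import subprocess
import sys


def run_and_capture(command):
    """Run a command given as a list of arguments; return (exit_code, stdout_text)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode, completed.stdout


# Use Rscript when it is installed; otherwise fall back to the Python
# interpreter (an assumption made only so this sketch runs anywhere).
if shutil.which("Rscript"):
    command = ["Rscript", "-e", 'cat("Hello from R")']
else:
    command = [sys.executable, "-c", 'print("Hello from Python", end="")']

exit_code, output = run_and_capture(command)
print(exit_code, output)
```

Passing the command as a list avoids shell quoting issues, and a nonzero exit code signals that the script failed.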
This method works when we only need to run an R script from start to finish and check whether it succeeded. However, it falls short if we intend to use values computed by our R code within the Python code. The rpy2 package addresses this limitation.
Consider the following code snippet of R, where we define a function for adding numbers and call it twice.
add_two_numbers <- function(first_num, sec_num) {
    return(first_num + sec_num)
}

print(add_two_numbers(11, 10))
print(add_two_numbers(8, 20))
Now, to execute the above program in Python, we can use the robjects submodule from the rpy2 package as follows:
import rpy2.robjects as robjects

robjects.r('''
add_two_numbers <- function(first_num, sec_num) {
    return(first_num + sec_num)
}

print(add_two_numbers(11, 10))
print(add_two_numbers(8, 20))
''')
Line 1: We import the rpy2.robjects module. This module serves as a bridge, enabling Python to communicate with R.
Lines 3–10: The triple-quoted block passed to robjects.r() contains R code that will be executed within the R environment connected through rpy2.
In this example, we’re defining an R function named add_two_numbers, designed to add two numbers.
The function takes two parameters, first_num and sec_num, and returns their sum.
Subsequently, we use the print() function from R to display the result of calling add_two_numbers twice with different input values.
First, we call add_two_numbers(11, 10), which calculates and prints the sum of 11 and 10.
Then, we call add_two_numbers(8, 20), which computes and displays the sum of 8 and 20.
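The R code above only prints its results. To use a value on the Python side, one possible approach is to define the function in R and then look it up through robjects.globalenv, which makes it callable from Python. The sketch below assumes that rpy2 and R are installed; the try/except guard and the total variable are additions for illustration so that the snippet degrades gracefully when they aren’t.

```python
try:
    import rpy2.robjects as robjects
except Exception:  # rpy2 or R itself may be missing on this machine
    robjects = None

total = None
if robjects is not None:
    # Define the function inside the embedded R session.
    robjects.r('''
    add_two_numbers <- function(first_num, sec_num) {
        return(first_num + sec_num)
    }
    ''')

    # Look the R function up by name and call it like a Python function.
    add_two_numbers = robjects.globalenv['add_two_numbers']
    result = add_two_numbers(11, 10)  # an R vector of length 1

    total = result[0]                 # extract the single value
    print(total)
```

The call returns an R vector rather than a plain Python number, which is why the value is read with an index.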
Similarly, we can load R’s built-in datasets into Python through R’s datasets package, as in the example below:
from rpy2.robjects.packages import importr, data

subpkg_datasets = importr('datasets')
mtcars_data_from_r = data(subpkg_datasets).fetch('mtcars')['mtcars']
print(mtcars_data_from_r)
Line 1: We import importr and data from the rpy2.robjects.packages module. This module enables us to access R packages and their functionalities from Python.
Line 3: We then use the importr() function to load the R package named 'datasets' into our Python environment. This function establishes a connection to the R package, granting us access to its contents.
The variable subpkg_datasets now holds a reference to the 'datasets' package that we’ve imported using importr().
Line 4: We use the data() function from rpy2.robjects.packages to interact with the datasets available within the loaded R package.
By providing subpkg_datasets as an argument, we specify that we want to work with the datasets contained in the 'datasets' package.
With the fetch() method, we retrieve the 'mtcars' dataset from the R 'datasets' package. The result is a dictionary-like structure (an R environment) that holds the dataset’s content.
We access the 'mtcars' dataset within the retrieved structure using ['mtcars'].
The obtained dataset, referred to as mtcars_data_from_r, is now available in our Python environment. This dataset contains information about various car models and their attributes, originally sourced from the R environment.
Line 5: To illustrate the outcome, we use Python’s print() function to display the content of the mtcars_data_from_r dataset. This shows the car-related data, including details like miles per gallon (mpg), horsepower (hp), and more.
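Once loaded, the dataset can also be inspected from Python instead of only printed. The following sketch assumes that the fetched object is an rpy2 DataFrame, which exposes nrow, ncol, colnames, and rx2() for extracting a column by name. The guard around the import and the variable names shape, column_names, and mean_mpg are illustrative additions.

```python
try:
    from rpy2.robjects.packages import importr, data
except Exception:  # rpy2 or R may not be installed on this machine
    importr = data = None

shape = None
if importr is not None:
    subpkg_datasets = importr('datasets')
    mtcars = data(subpkg_datasets).fetch('mtcars')['mtcars']

    shape = (mtcars.nrow, mtcars.ncol)    # number of rows and columns
    column_names = list(mtcars.colnames)  # column labels as Python strings

    mpg = mtcars.rx2('mpg')               # one column as an R numeric vector
    mean_mpg = sum(mpg) / len(mpg)        # plain Python arithmetic on R data

    print(shape, column_names, mean_mpg)
```

Because R vectors support len() and iteration in rpy2, simple summaries like the mean can be computed without converting the whole dataset first.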
In conclusion, calling R from Python adds flexibility to our programming tasks. We can run existing R scripts with subprocess, or use rpy2 to execute R code and access R datasets directly, combining R’s statistical strengths with Python’s general-purpose tooling for data analysis.