When writing PL/SQL code for Big Data applications it’s always a good idea to leverage the database as much as possible.  This is particularly true for ETL or warehousing processing.

In my experience, I’ve tried to quantify objects in Oracle programming that work best to temporarily  store data.  As part of your Big Data application, anytime you need to leverage relational databases, here is a table that I typically use as a guideline for storing data for loading to or extracting data from a database.

Typically, I judge the what type of cursor object to used based on factors such as performance, the number of rows that need to be read, the operation that I am performing and how much memory that I would like to leverage.

Number of Rows Explicit Cursor  Temporary Table PL/SQL Memory Collections Record Collections  Typical Memory Used
<= 1000 check_mark7 check_mark7 10 MB
1000 to 500,000 check_mark7 check_mark7 check_mark7 100-500 MB
500,00 to 2,000,000+ check_mark7 check_mark7 500MB +

Although, explicit cursors can be used for much larger data sizes, if performance is your goal, a few thousand rows are optimal to maintain good performance, especially if you are opening and closing many cursors in a small window of time.