Saturday, September 22, 2007

Reading files in a directory

Fairly typical task, one can find some java based solutions at AskTom website. This post illustrates how to to get the listing using relatively modern extensions of Oracle RDBMS: dbms_scheduler and external tables. I'm using running on RH Linux.

The idea is simple: we execute an external OS command and then get its output back via external table interface.

First thing we have to do is to come up with a command, I'm using a simple shell script:

[ -z "$1" -o -z "$2" ] && exit 1
/usr/bin/find "$1" -type f -maxdepth 1 -ctime -1 -printf "%f %CY%Cm%Cd%CH%CM%CS\n" > "$2"

As you can see the script accepts two arguments: first is a directory, second a file that then will be queried as an external table.

Then we create an external table:

CREATE TABLE get_dir_context (
  filename    VARCHAR2(1024)
, ctime       DATE
(TYPE oracle_loader
    filename   CHAR(1024)
  , ctime      DATE "YYYYMMDDHH24MISS"
 LOCATION ('listing.txt')

And eventually, execute the external command using dbms_scheduler:

  l_job_name                  VARCHAR2(30) := 'GET_DIR_CONTENT';
  l_command                   VARCHAR2(30) := '/tmp/';
  l_directory                 VARCHAR2(1024) := '/tmp';
  l_outputfile                VARCHAR2(1024) := '/tmp/listing.txt';
    job_name          => l_job_name
  , job_type          => 'EXECUTABLE'
  , job_action        => l_command
  , enabled           => FALSE
  , number_of_arguments => 2
  , auto_drop         => TRUE
    job_name          => l_job_name
  , argument_position => 1
  , argument_value    => l_directory
    job_name          => l_job_name
  , argument_position => 2
  , argument_value    => l_outputfile

One can use these commands to monitor the execution and fetch the content:

SELECT job_name, status, run_duration, actual_start_date, additional_info
  FROM user_scheduler_job_run_details;

COLUMN filename FORMAT A30
SELECT * FROM get_dir_context;

EXEC dbms_scheduler.purge_log; 

In this example the script is placed into a /tmp directory but make sure you secure the scripts used in production!

No comments: