Tuesday, November 01, 2005

PL/SQL implementation of the crc32()

Recently I was kindly asked to share PL/SQL implementation of the crc32() I developed once. I slightly polished and optimized it. NO WARRANTY is implied. Please let me know if you find any errors.

/***********************************************************************
** These functions calculate crc32
** calculate -- this one is PLS_INTEGER based, some simple modifications
** are needed to make it working on < 9i editions
** calculate10 -- uses enriched Oracle 10g BITAND (INTEGER)
**
** a_data_i -- input string
** a_init_i -- crc32 initial value
**
** As an alternative one can consider using CRC32 java class
** http://java.sun.com/j2se/1.4.2/docs/api/java/util/zip/CRC32.html
** or to use dbms_crypto/dbms_obfuscation_toolkit's hash functions
*/
/* Example:

SQL> !perl -e 'use String::CRC32; print crc32("quick brown fox jumps over the lazy dog") . "\n";'

998814576

SQL> VAR crc32 NUMBER
SQL> EXEC :crc32 := crc32.calculate('quick brown fox jumps over the lazy dog', :crc32);
SQL> PRINT crc32

CRC32
----------
998814576

SQL> BEGIN
2 :crc32 := 0;
3 :crc32 := crc32.calculate('quick ', :crc32);
4 :crc32 := crc32.calculate('brown ', :crc32);
5 :crc32 := crc32.calculate('fox ', :crc32);
6 :crc32 := crc32.calculate('jumps ', :crc32);
7 :crc32 := crc32.calculate('over ', :crc32);
8 :crc32 := crc32.calculate('the ', :crc32);
9 :crc32 := crc32.calculate('lazy ', :crc32);
10 :crc32 := crc32.calculate('dog', :crc32);
11 END;
12 /

PL/SQL procedure successfully completed.

SQL> PRINT crc32

CRC32
----------
998814576

*/

7 comments:

Anonymous said...

This is your function redone so that it can be added inline and support clob. It also outputs the CRC as a Varchar2. I'll have to post in 3 messages since there is a 4095 character limit.

FUNCTION CRC32b (
a_data_i IN CLOB,
a_init_i IN INTEGER := 0)
RETURN VARCHAR2 IS
TYPE crc32_t IS VARRAY (256) OF INTEGER NOT NULL;
TYPE hex_t IS TABLE OF PLS_INTEGER INDEX BY VARCHAR2(2);

hex hex_t;
g_raw RAW(32767);
xFFFFFFFF CONSTANT INTEGER := 4294967295;
xFFFFFF CONSTANT INTEGER := 16777215;
x10000 CONSTANT INTEGER := 65536;
xFF CONSTANT PLS_INTEGER := 255;
x100 CONSTANT PLS_INTEGER := 256;
l_chr PLS_INTEGER;
l_idx PLS_INTEGER;
l_crc INTEGER;
l_len PLS_INTEGER;
l_lop PLS_INTEGER;
l_rop INTEGER;
l_lop_h PLS_INTEGER;
l_lop_l PLS_INTEGER;
l_rop_h PLS_INTEGER;
l_rop_l PLS_INTEGER;
l_rem PLS_INTEGER;

Anonymous said...

crc32 CONSTANT crc32_t := crc32_t (
0000000000, 1996959894, 3993919788, 2567524794, 0124634137, 1886057615, 3915621685, 2657392035,
0249268274, 2044508324, 3772115230, 2547177864, 0162941995, 2125561021, 3887607047, 2428444049,
0498536548, 1789927666, 4089016648, 2227061214, 0450548861, 1843258603, 4107580753, 2211677639,
0325883990, 1684777152, 4251122042, 2321926636, 0335633487, 1661365465, 4195302755, 2366115317,
0997073096, 1281953886, 3579855332, 2724688242, 1006888145, 1258607687, 3524101629, 2768942443,
0901097722, 1119000684, 3686517206, 2898065728, 0853044451, 1172266101, 3705015759, 2882616665,
0651767980, 1373503546, 3369554304, 3218104598, 0565507253, 1454621731, 3485111705, 3099436303,
0671266974, 1594198024, 3322730930, 2970347812, 0795835527, 1483230225, 3244367275, 3060149565,
1994146192, 0031158534, 2563907772, 4023717930, 1907459465, 0112637215, 2680153253, 3904427059,
2013776290, 0251722036, 2517215374, 3775830040, 2137656763, 0141376813, 2439277719, 3865271297,
1802195444, 0476864866, 2238001368, 4066508878, 1812370925, 0453092731, 2181625025, 4111451223,
1706088902, 0314042704, 2344532202, 4240017532, 1658658271, 0366619977, 2362670323, 4224994405,
1303535960, 0984961486, 2747007092, 3569037538, 1256170817, 1037604311, 2765210733, 3554079995,
1131014506, 0879679996, 2909243462, 3663771856, 1141124467, 0855842277, 2852801631, 3708648649,
1342533948, 0654459306, 3188396048, 3373015174, 1466479909, 0544179635, 3110523913, 3462522015,
1591671054, 0702138776, 2966460450, 3352799412, 1504918807, 0783551873, 3082640443, 3233442989,
3988292384, 2596254646, 0062317068, 1957810842, 3939845945, 2647816111, 0081470997, 1943803523,
3814918930, 2489596804, 0225274430, 2053790376, 3826175755, 2466906013, 0167816743, 2097651377,
4027552580, 2265490386, 0503444072, 1762050814, 4150417245, 2154129355, 0426522225, 1852507879,
4275313526, 2312317920, 0282753626, 1742555852, 4189708143, 2394877945, 0397917763, 1622183637,
3604390888, 2714866558, 0953729732, 1340076626, 3518719985, 2797360999, 1068828381, 1219638859,
3624741850, 2936675148, 0906185462, 1090812512, 3747672003, 2825379669, 0829329135, 1181335161,
3412177804, 3160834842, 0628085408, 1382605366, 3423369109, 3138078467, 0570562233, 1426400815,
3317316542, 2998733608, 0733239954, 1555261956, 3268935591, 3050360625, 0752459403, 1541320221,
2607071920, 3965973030, 1969922972, 0040735498, 2617837225, 3943577151, 1913087877, 0083908371,
2512341634, 3803740692, 2075208622, 0213261112, 2463272603, 3855990285, 2094854071, 0198958881,
2262029012, 4057260610, 1759359992, 0534414190, 2176718541, 4139329115, 1873836001, 0414664567,
2282248934, 4279200368, 1711684554, 0285281116, 2405801727, 4167216745, 1634467795, 0376229701,
2685067896, 3608007406, 1308918612, 0956543938, 2808555105, 3495958263, 1231636301, 1047427035,
2932959818, 3654703836, 1088359270, 0936918000, 2847714899, 3736837829, 1202900863, 0817233897,
3183342108, 3401237130, 1404277552, 0615818150, 3134207493, 3453421203, 1423857449, 0601450431,
3009837614, 3294710456, 1567103746, 0711928724, 3020668471, 3272380065, 1510334235, 0755167117);

Anonymous said...

BEGIN
FOR i IN 0..255 LOOP
hex(TO_CHAR(i, 'FM0X')) := i;
END LOOP;
l_crc := xFFFFFFFF - NVL(a_init_i, 0);
--g_raw := utl_raw.cast_to_raw(a_data_i);
l_len := LENGTHB(a_data_i);
FOR l_idx IN 1..l_len
LOOP
/* One should spend some time testing what's faster TO_NUMBER
or hash lookup. Most probably the difference is very minimal
especially when the native compilation is used. Left as is
on illustrative purpose.
*/
--l_chr := hex(RAWTOHEX(utl_raw.substr(g_raw, l_idx, 1))); -- TO_NUMBER is not used, we use hash look up
l_chr := hex(RAWTOHEX(utl_raw.cast_to_raw(SUBSTR(a_data_i, l_idx, 1)))); -- TO_NUMBER is not used, we use hash look up
l_lop := TRUNC(l_crc / x100);
l_rem := MOD(l_crc, x100);

l_rop := crc32(BITAND((l_rem - BITAND(l_rem, l_chr)) + (l_chr - BITAND(l_rem, l_chr)), xFF) + 1);

-- PL/SQL BITAND supports BINARY_INTEGER is an INTEGER in -2147483647..2147483647 range
l_lop_h := TRUNC(l_lop / x10000);
l_lop_l := MOD(l_lop, x10000);

l_rop_h := TRUNC(l_rop / x10000);
l_rop_l := MOD(l_rop, x10000);

l_crc := ((l_lop_h - BITAND(l_lop_h, l_rop_h)) + (l_rop_h - BITAND(l_lop_h, l_rop_h))) * x10000
+ (l_lop_l - BITAND(l_lop_l, l_rop_l)) + (l_rop_l - BITAND(l_lop_l, l_rop_l));
END LOOP;
RETURN SUBSTR(TRIM(TO_CHAR(xFFFFFFFF - l_crc, '0XXXXXXX')),1,8);
EXCEPTION
WHEN OTHERS THEN
DBMS_OUTPUT.put_line ('Size of a_data_i:'||LENGTHB(a_data_i)||CHR(10)
||'Backtrace:'||dbms_utility.format_error_backtrace||CHR(10)
||'Stack:'||dbms_utility.format_error_stack);
RAISE;
END CRC32b;

Vladimir Begun said...

I am glad that the original code was helpful.

Not sure what does it mean: "so that it can be added inline". "It also outputs the CRC as a Varchar2." -- it's Ok, if you want to show it in HEX but it would be much proper to use FM as a TO_CHAR conversion format mask that would return a value with no leading or trailing blanks. That's minor though. But what you might want consider is to do less number of SUBSTR operations against the CLOB value, especially doing SUBSTR of one character. Instead use 32K string as a buffer. String operations against CLOB variable tend to be more expensive than against VARCHAR2.

Good luck.

Anonymous said...

Спасибо, очень пригодилось!
С уважением,
Сергей.
СПб

Vladimir Begun said...

Пожалуйста! :-)

Anonymous said...

The link for the zip file is broken... Could you please reupload it?