Michael's Software Thoughts & Ramblings: More OpenLDAP

I tested OpenLDAP multimaster replication some more today. I went back to the official release version, 2.4.8. I ran my test on a 64-bit AMD machine with both servers on the same box, using different ports and different installation folders. Things ran fine for the most part. Synchronization was consistent when loading and removing 5000 entries like this:


dn: cn=Fred XX_0,dc=mgm,dc=com
objectClass: inetOrgPerson
objectClass: top
givenName: Fred
sn: XX
cn: Fred XX_0
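
For anyone who wants to build a similar test load, here is a minimal Python sketch that generates entries shaped like the one above, plus the matching delete records. It is not the script I actually used. The function names are made up for illustration, and numbering the entries from _0 up to _4999 is an assumption based on the sample entry.

```python
# Sketch only: builds LDIF text shaped like the sample entry above.
# Function names and the 0..count-1 numbering are assumptions.

BASE_DN = "dc=mgm,dc=com"  # suffix taken from the sample entry


def make_entry(i, base_dn=BASE_DN):
    """Return one LDIF add entry for test user number i."""
    cn = f"Fred XX_{i}"
    return "\n".join([
        f"dn: cn={cn},{base_dn}",
        "objectClass: inetOrgPerson",
        "objectClass: top",
        "givenName: Fred",
        "sn: XX",
        f"cn: {cn}",
    ])


def make_add_ldif(count=5000, base_dn=BASE_DN):
    """Return LDIF text that adds `count` entries, separated by blank lines."""
    return "\n\n".join(make_entry(i, base_dn) for i in range(count)) + "\n"


def make_delete_ldif(count=5000, base_dn=BASE_DN):
    """Return LDIF change records that delete the same `count` entries."""
    records = [
        f"dn: cn=Fred XX_{i},{base_dn}\nchangetype: delete"
        for i in range(count)
    ]
    return "\n\n".join(records) + "\n"


if __name__ == "__main__":
    # Print a small sample instead of all 5000 entries.
    print(make_add_ldif(2))
    print(make_delete_ldif(2))
```

The output of make_add_ldif can be saved to a file and loaded with ldapadd, and the output of make_delete_ldif can be fed to ldapmodify, each pointed at one of the two servers.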

I don't know much about Berkeley DB, but my issues seemed to happen after abrupt shutdowns and what I think were database corruptions. slapd dumped core on each server in the same place, at different times during the testing:

Server 1 Core:


#0  is_ad_subtype (sub=0x0, super=0x832430) at ad.c:489
#1  0x00000000004213a7 in attrs_find (a=0x936998, desc=0x832430) at attr.c:647
#2  0x00000000004352bf in test_ava_filter (op=0x41801640, e=0x914a18, ava=0x41800fb0, type=163) at filterentry.c:617
#3  0x0000000000435761 in test_filter (op=0x41801640, e=0x914a18, f=0x41800fd0) at filterentry.c:88
#4  0x0000000000480d9b in bdb_search (op=0x41801640, rs=0x41800ec0) at search.c:845
#5  0x0000000000470be2 in overlay_op_walk (op=0x41801640, rs=0x41800ec0, which=op_search, oi=0x87b2b0, on=0x0) at backover.c:653
#6  0x00000000004710d5 in over_op_func (op=0x41801640, rs=0x41800ec0, which=op_search) at backover.c:705
#7  0x000000000046a56e in syncrepl_entry (si=0x8a31b0, op=0x41801640, entry=0x90add8, modlist=0x418015a8, syncstate=1,
   syncUUID=, syncCSN=0x0) at syncrepl.c:1989
#8  0x000000000046c395 in do_syncrep2 (op=0x41801640, si=0x8a31b0) at syncrepl.c:844
#9  0x000000000046de8c in do_syncrepl (ctx=0x41801df0, arg=) at syncrepl.c:1226
#10 0x000000000041a692 in connection_read_thread (ctx=0x41801df0, argv=) at connection.c:1213
#11 0x00000000004fa6f4 in ldap_int_thread_pool_wrapper (xpool=0x83c890) at tpool.c:625
#12 0x00000035f1e06407 in start_thread () from /lib64/libpthread.so.0
#13 0x00000035f12d4b0d in clone () from /lib64/libc.so.6

After this happened, a restart of slapd would always hang on either server with this stack trace:


#0  0x00000035f1e076dd in pthread_join () from /lib64/libpthread.so.0
#1  0x00000000004e4501 in syncprov_db_open (be=0x8a27c0, cr=) at syncprov.c:2632
#2  0x0000000000470868 in over_db_func (be=0x8a27c0, cr=0x7fffad390610, which=) at backover.c:62
#3  0x0000000000427350 in backend_startup_one (be=0x8a27c0, cr=0x7fffad390610) at backend.c:224
#4  0x000000000042761a in backend_startup (be=0x8a27c0) at backend.c:316
#5  0x000000000040505a in main (argc=4, argv=0x7fffad3908d8) at main.c:932

I'm happy to see Gavin's interest in my blog. It shows his dedication--a tribute to the OpenSource community.

3 comments:

Gavin Henry said...: Did you file an ITS (http://www.openldap.org/its)? We need that, you know ;-)

I find your posts and others via several feeds. Yours came up via:

http://feeds.technorati.com/search/openldap

--
Kind Regards,

Gavin Henry.
OpenLDAP Engineering Team.

E ghenry@OpenLDAP.org

Community developed LDAP software.

http://www.openldap.org/project/; May 15, 2008 at 12:35 PM
Michael Martin said...: Hi Gavin,

Thanks for your interest in my blog.

I tested OpenLDAP 2.4.9, and I no longer see the stability issues with the slapd process I was seeing in 2.4.8. I talk about it more here: openldap-249-multi-master-replication; May 16, 2008 at 10:18 AM
Gavin Henry said...: Excellent news.; May 17, 2008 at 1:45 PM

Wednesday, February 27, 2008
