Wednesday, July 16, 2008


Dominant does not mean all-encompassing

Oracle's Nishant Kaushik took some heat ("Is AD really the dominant Identity Store out there?") for not caving in to the "Active Directory is everywhere" litany. Bravo!

What some of his detractors fail to realize is that there are few, if any, organizations with more than 100 users that use AD as their sole identity datastore. Identity data - which includes not only name, rank and serial number, but also all of the attributes associated with the identity - is stored in myriad places, which can be local to the user, somewhere in the enterprise, or out in the internet cloud. And AD has no mechanisms whatsoever for getting at that data.

A service or application which wishes to consume identity data could search all possible datastores - provided, of course, it knew where they were and what protocols they supported for exporting data. How much easier, though, for the application developer to hit one datastore for everything that's needed? That should call forth no argument from the AD-boosters - that's the argument they're using. But, as I said, AD has no way to get the data out of all of those other datastores. One thing does, though - the virtual directory.
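To make the "one datastore for everything" idea concrete, here is a minimal sketch in Python of a facade that routes each attribute to whichever backend holds it. Everything in it is an assumption made for illustration: the `VirtualDirectory` class, the attribute names and the two in-memory "stores" are invented, and a real product would speak LDAP, SQL or web services to its backends rather than call a dictionary.

```python
from typing import Callable, Dict, Optional

# Each backend is modeled as a function: user id -> dict of attributes (or None).
# Real backends would be LDAP servers, SQL databases, web services and so on.
Backend = Callable[[str], Optional[Dict[str, str]]]


class VirtualDirectory:
    """Hypothetical facade: one lookup call, many underlying datastores."""

    def __init__(self) -> None:
        self._routes: Dict[str, Backend] = {}  # attribute name -> backend

    def register(self, attribute: str, backend: Backend) -> None:
        """Declare which backend is the source for an attribute."""
        self._routes[attribute] = backend

    def lookup(self, user_id: str, *attributes: str) -> Dict[str, Optional[str]]:
        """Fetch each attribute from its own backend at request time."""
        result: Dict[str, Optional[str]] = {}
        for attr in attributes:
            backend = self._routes.get(attr)
            record = backend(user_id) if backend else None
            result[attr] = record.get(attr) if record else None
        return result


# Stand-ins for an AD-style directory and an HR database (made-up data).
ad_store = {"jdoe": {"mail": "jdoe@example.com"}}
hr_store = {"jdoe": {"hireDate": "2007-03-01"}}

vd = VirtualDirectory()
vd.register("mail", ad_store.get)
vd.register("hireDate", hr_store.get)

# The application asks one place; unknown attributes come back as None.
profile = vd.lookup("jdoe", "mail", "hireDate", "badgeNumber")
```

The application never learns which store answered for which attribute, and nothing is copied ahead of time: each value is read from its source when the request arrives.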


Wednesday, July 09, 2008

Smoke, mirrors - and numbers

RSA's Matt Flynn has been participating in the virtual directory/metadirectory conversation for some time, but his entry for today brings in more smoke and less clarity. Having been called out by Clayton Donley, Matt ripostes:

Also, it sounded like Clayton took my comments to mean that "everyone needs to be using Active Directory for everything", which was (I think obviously) not the intent. My point is that although the top 500 or 1000 companies may have a number of directories for various strategic uses, there are probably 20x that number of companies that use only Active Directory as the central and primary user store...
Now the problem here is in the numbers - the "top 500 ...companies" might harbor 5 million+ users. The "20x that number" (or, say, 10,000 companies) might total 50,000 users. Or, in other words, 1% of the total users are in all-AD environments, 99% are in heterogeneous situations. Which actually proves Clayton's point and refutes Matt's.
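For anyone who wants to check the arithmetic, here it is spelled out in Python. The user counts are the illustrative guesses from the paragraph above, not survey data.

```python
# Figures are the illustrative guesses from the paragraph above, not survey data.
heterogeneous_users = 5_000_000   # users in the "top 500" companies
ad_only_companies = 500 * 20      # "20x that number" -> 10,000 companies
ad_only_users = 50_000            # assumed total across those companies

total_users = heterogeneous_users + ad_only_users
ad_only_share = ad_only_users / total_users
avg_users_per_ad_only_company = ad_only_users / ad_only_companies

print(f"{ad_only_share:.1%} of users are in all-AD environments")
print(f"{avg_users_per_ad_only_company:.0f} users per all-AD company on average")
```

Change the guesses and the split moves with them, but under these numbers the all-AD shops account for roughly 1% of users.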

Additionally, of course, as long as most vendors (and most enterprises) make it so difficult to extend the schema of the central repository (whenever there is one) there will always be a need for a virtual repository for applications to use. The need for, and uses of, virtual directories is growing and is still a few years away from peaking.

Monday, July 07, 2008

A clueless manifesto

A big tip o'the hat to Jeff Bohren for drawing my attention to this note from Alex Karasulu of the ApacheDS project. Now remember, he's working on a Directory Server project. Yet he says:

The VD [Virtual Directory] implementations of today like Penrose, are just hacks without a formal computational basis to them. People trying to get a product to market rapidly to sell a company. We intend to enable virtualization eventually with a solid footing in the LDAP administrative model using this concept of a view. Views, as well as triggers/SPs will enable new ways to easily solve the problems encountered in the identity space. As a teaser just think what could be done in the provisioning space if AD supported triggers? Real technology will yield solid reliable solutions instead of these band aids we’re seeing during this identity gold rush.
Too bad he's not aware of Radiant Logic, Symlabs and the Oracle (née OctetString) virtual directories - all of which have been around longer than ApacheDS and all of which support triggering mechanisms either through straight SQL or through policy implementations. They're pretty good with "views," also. I'm still looking for that "trigger" mechanism in the LDAP model!

Friday, May 16, 2008

New tricks and old tools

Kim Cameron follows up on Clayton Donley's post with some thoughts of his own. And ends by quoting Clayton:
"The real solution here is a combination of virtualization with more standardized publish/subscribe for delivery of changes. This gets us away from this ad-hoc change discovery that makes meta-directories miserable, while ensuring that the data gets where it needs to go for transactions within an application."

and adding: " As soon as applications understand they are PART OF a wider distributed fabric, they could propagate changes using a publication pattern that retains the closed-loop verification of self-converging metadirectory. "

I couldn't agree more with these two erudite gentlemen.

Unfortunately, today's applications, and especially yesterday's applications still hanging around on our networks, and even tomorrow's applications for some time to come, won't be written to be a part of a "wider distributed fabric," especially as that fabric doesn't yet exist in any meaningful way. And, as Kim said in an earlier posting, "Here’s the problem. Infrastructure people cannot dictate how application developers should build their applications." We can build the infrastructure that will excel in a publish-subscribe world, but getting the apps developers to buy in to that model, well, that's something else. I'm all for building the infrastructure and plumbing of the future, but we need to adapt today's tools so that we can get the job done while waiting for the new plumbing.

Monday, May 12, 2008

optimization and expense

Neil Macehiter comments on the last post:

But the issue is not with the language you use to perform the query: it's where the data is located. If you have data in separate physical databases then it's necessary to pull the data from the separate sources and join them locally. So, in Kim's example, if you have 5000 employees and have sold 10000 computers then you need to pull down the 15000 records over the network and perform the join locally (unless you have an incredibly smart distributed query optimiser which works across heterogeneous data stores). This is going to be more expensive than if the computer order and employee data are colocated.

The "expense" is there no matter how you do it. Putting all of your potentially useful data in one RDBMS is incredibly wasteful of storage space and comes at the cost of slowing down all queries. It also means that synchronizations need to be done almost constantly in order for the most up to date data to be available, a network "expense". But the search can be optimized before any data is pulled. For example, query the HR database for the lowest employee number issued after the first date you're interested in (assuming that employee numbers are issued sequentially). Then query the orders for PC purchases by that employee number or higher. Yes, it's two steps, but it's also faster than pulling down all the records to do a local join. And, I hold, less "expensive" than maintaining a huge silo of all potentially useful data.
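Here is a small sketch of that two-step query in Python, with two separate SQLite in-memory databases standing in for the remote HR and purchasing systems. The table layouts, column names and sample rows are all invented for illustration, and the trick rests on the same assumption as above: employee numbers are issued sequentially.

```python
import sqlite3

# Two separate in-memory databases stand in for the remote HR and purchasing systems.
hr = sqlite3.connect(":memory:")
hr.execute("CREATE TABLE employees (emp_no INTEGER PRIMARY KEY, hired TEXT)")
hr.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1001, "2003-02-10"), (1002, "2005-08-01"),
     (1003, "2006-09-15"), (1004, "2007-11-30")],
)

orders = sqlite3.connect(":memory:")
orders.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, emp_no INTEGER, item TEXT)"
)
orders.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1001, "PC"), (2, 1003, "PC"), (3, 1004, "monitor"),
     (4, 1004, "PC"), (5, 1002, "PC")],
)


def recent_hire_pc_orders(cutoff: str):
    """Return (order_id, emp_no) for PC orders by employees hired on or after cutoff."""
    # Step 1: ask HR for a single number, the lowest employee number issued
    # on or after the cutoff date (relies on sequential numbering).
    first_emp_no = hr.execute(
        "SELECT MIN(emp_no) FROM employees WHERE hired >= ?", (cutoff,)
    ).fetchone()[0]
    if first_emp_no is None:
        return []  # nobody hired since the cutoff, so nothing to fetch
    # Step 2: ask purchasing only for PC orders at or above that number.
    return orders.execute(
        "SELECT order_id, emp_no FROM orders "
        "WHERE item = 'PC' AND emp_no >= ? ORDER BY order_id",
        (first_emp_no,),
    ).fetchall()


matches = recent_hire_pc_orders("2006-05-12")
```

Only one integer crosses the wire in the first step and only the matching orders in the second. Neither full table is pulled down.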


Getting more violent all the time

The distinguished Mr. Cameron has restated what he thinks is our major disagreement over synchronization and replication of identity data on the so-called "identity bus." He says:

"Sometimes an application needs to do complex searches involving information 'mastered' in multiple locations. I’ll make up a very simple 'two location' example to demonstrate the issue:

'What purchases of computers were made by employees who have been at the company for less than two years?'

Here we have to query 'all the purchases of computers' from the purchasing system, and 'all employees hired within the last two years' from the HR system, and find the intersection.

Although the intersection might only represent a few records, performing this query remotely and bringing down each result set is very expensive. No doubt many computers have been purchased in a large company, and a lot of people are likely to have been hired in the last two years. If an application has to perform this type of query with great efficiency and within a controlled response time, the remote query approach of retrieving all the information from many systems and working out the intersection may be totally impractical.

Compare this to what happens if all the information necessary to respond to a query is present locally in a single database. I just do a 'join' across the tables, and the SQL engine understands exactly how to optimize the query so the result involves little computing power and 'even less time'. Indexes are used and distributions of values well understood: many thousands of really smart people have been working on these optimizations in many companies for the last 40 years."

What Kim fails to note, however, is that a well designed virtual directory (see Radiant Logic's offering, for example) will allow you to do a SQL query to the virtual tables! You get the best of both: up to date data (today's new hires and purchases included) with the speed of an SQL join. And all without having to replicate or synchronize the data. I'm happy, the application is happy - and Kim should be happy too. We are in violent agreement about what the process should look like at the 40,000 foot level and only disagree about the size and shape of the paths - or, more likely, whether they should be concrete or asphalt.
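As a rough analogy only, and not a description of how Radiant Logic or any other vendor implements it, SQLite's ATTACH DATABASE shows what "a SQL query to the virtual tables" feels like to the application: two independent databases, one connection, one join. The schema names `hr` and `purchasing`, the tables and the rows are invented for this sketch.

```python
import sqlite3

# One connection plays the part of the virtual layer; each attached database
# is an independent store that keeps its own data.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS hr")
conn.execute("ATTACH DATABASE ':memory:' AS purchasing")

conn.execute(
    "CREATE TABLE hr.employees (emp_no INTEGER PRIMARY KEY, name TEXT, hired TEXT)"
)
conn.execute(
    "CREATE TABLE purchasing.orders "
    "(order_id INTEGER PRIMARY KEY, emp_no INTEGER, item TEXT)"
)
conn.executemany(
    "INSERT INTO hr.employees VALUES (?, ?, ?)",
    [(1, "Alice", "2004-01-15"), (2, "Bob", "2007-06-01"), (3, "Carol", "2008-02-20")],
)
conn.executemany(
    "INSERT INTO purchasing.orders VALUES (?, ?, ?)",
    [(10, 1, "PC"), (11, 2, "PC"), (12, 3, "printer"), (13, 3, "PC")],
)

# Kim's question, asked once, as an ordinary join across both stores.
recent_buyers = conn.execute(
    "SELECT o.order_id, e.name "
    "FROM purchasing.orders AS o "
    "JOIN hr.employees AS e ON e.emp_no = o.emp_no "
    "WHERE o.item = 'PC' AND e.hired >= ? "
    "ORDER BY o.order_id",
    ("2006-05-12",),
).fetchall()
```

The application writes a plain join and nothing is replicated between the two stores beforehand. How a given virtual directory plans and pushes down such a query is product-specific and outside this sketch.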

Friday, April 11, 2008

A herring of a different color

You almost had me, Kim. I read your latest entry and was ready to share that olive branch. Right up to the last paragraphs when you say (about me):

"...He keeps saying I propose 'a directory that gathers and holds ALL the data from ALL your other directories.' Dave, this is just untrue and unhelpful. “ALL” was never the goal - or the practice - of metadirectory, and you know it. The goal was to represent the 'object core' - the attributes shared across many applications and that need therefore to be kept consistent and synchronized if stored in multiple places. Our other goal was to maintain the knowledge about what objects 'were called' in different directories and databases (thus the existence of 'connector space').

Basically, the ”ALL” argument is a red herring..."

Not at all. Let's step back a pace or two, or a posting or two, and think about the reasons for having this meta/virtual directory. Yes, it helps to normalize the data and keep it in sync. But if that were all, then a couple of keyboard monkeys could handle the chore and, at least in the case of normalization, could do it more quickly than a semi-automated process.

But the real reason we want to do this is so that identity data is available to applications. Available to them using a single vocabulary and a single protocol. Not that there can't be multiple vocabularies and protocols, but any one application would only need to use one of each - each application programmer would only need to use the vocabulary and protocol she was most familiar with.

But for this to be effective, the programmer needs to know that any identity data they need is available through this mechanism. And the only way any data can be available is if all data is available. The identity data must be pervasive and ubiquitous - available whenever and wherever you need it.

From the application's point of view, it should appear to be a single silo but in reality, the data will be distributed throughout the fabric of the network both within and without the enterprise, the identity provider or other data store.

The promise of the meta/virtual directory is that it can serve up the current, correct data on demand from wherever it resides. And to do that, it has to aim to provide all identity data.

Now, to forestall some people, let me add that the security of this system is a given - there need to be strict and fine-grained access controls for the data. There need to be well designed mechanisms allowing for whoever controls a bit of data to authorize its release. Without these things the system is useless because no one would use it.

But this system needs to aim to have available all identity data, every conceivable bit of it. Because without that, the application programmer can't be sure that the bit he needs is there and so will set up alternative storage for the bits that that application needs.

We're not there yet, but we need to go that way.

Wednesday, April 09, 2008

Your mother was a hamster and your father smelt of elderberries!

Here I'd thought I'd offered Kim Cameron a bit of an olive branch in the virtual/meta/uber directory discussion. But did he take it? Yes, he did, then attempted to whack a bunch of folks about the head and shoulders with it!

In a further attempt to clarify what he meant, Kim says:

"By 'next generation application' I mean applications based on web service protocols. Our directories need to integrate completely into the web services fabric, and application developers must to be able to interact with them without knowing LDAP."

Why Kim feels that LDAP is beyond the ken of today's application developers is beyond me, but the darker part of this is that he seems to say that only through the use of the Microsoft-controlled WS-* protocols (you can read their propaganda at their web site) can this be achieved. Nonsense.

Still, if any developers feel that only XML-based scripting is acceptable to use, then I'd suggest they consider the very good LDAP replacement, DSML, which has, sadly, languished for a number of years. Or there's SPML (for provisioning services). Even XACML could be used (although it would need a bit more work). The point is that there are open protocols, openly arrived at, that will do the job and today's application designers are bright enough to know how to use them.

I'm reminded by Phil Hunt's post on this issue that his work on the Identity Governance Framework, now an OpenLiberty project, also satisfies the requirement of open protocols, openly arrived at.

Monday, April 07, 2008

The blind philosophes of Identity

Kim has now responded ("Through the looking glass") to my Humpty Dumpty post, and we're beginning to sound like a couple of old philosophes arguing about whether or not to include "le weekend" and "hamburguer" and other Franglais in the French dictionary.

We really aren't that far apart.

In his post, Kim recalls launching the name "metadirectory" back in '95 with Craig Burton and I certainly don't dispute that. In fact, up until 1999, I even agreed somewhat with his definition:

"In my world, a metadirectory is one that holds metadata - not actual objects, but descriptions of objects and their locations in other physical directories."

But as I continued in that Network World column:
"Unfortunately, vendors such as Zoomit took the term 'metadirectory' and redefined it so it could be used to describe what I'd call an überdirectory - a directory that gathers and holds all the data from all your other directories."

Since no one took up my use of "uberdirectory," we started using "metadirectory" to describe the situations which required a new identity store and "virtual directory" for those that didn't.

So perhaps we're just another couple of blind men trying to describe an elephant.

Wednesday, April 02, 2008

Get on the bus!

Everybody else is. Dale Olds has commented. So has Phil Hunt. Let's all get together at the European ID Conference in Munich later this month and talk about the Identity Hub, the Identity Bus, the death of the metadirectory and so much more. Suggestions for a suitable meeting place (i.e., biergarten) near the Deutsches Museum are welcome - post as comments to this post.

See you there!

Wednesday, March 26, 2008

Meta-directories? Your father's ID store...

Kuppinger Cole's Felix Gaehtgens posts today ("Meta-directories? I’d say quaint, but not quite dead.") on the demise of the metadirectory and the rise of virtualization. Felix should know: he was formerly the VP at Symlabs, a major Virtual Directory provider. He says:

"Microsoft has made an investment into that technology by rewriting MIIS pretty much from scratch. And Siemens to this date probably has the most comprehensive and advanced meta-directory implementation with its DirXmetahub component that is part of its Dir-X offering. Nevertheless, meta-directories are arguably still around mostly because Microsoft forces this technology onto its customers for what I think are political reasons: Several people working for Microsoft in the field have told me that is was in Microsoft’s interest to have Active Directory as a central component, and believe it against Microsoft’s interest to have a “filtered access”, such as a virtual directory in front of AD, abstracting information away from what should be the authoritative source. I never really understood this fear, but recently it seems that this brick wall may be slowly starting to crumble."
Read the rest of his post for a synthesis of the argument Kim and I have been having, a synthesis that could be close to a solution.

Tuesday, March 25, 2008

with Kim Cameron as Humpty Dumpty...

One of my favorite passages from Lewis Carroll is the dialog in "Through the Looking Glass" between Alice and Humpty Dumpty:

"There's glory for you!"
"I don't know what you mean by 'glory,' " Alice said.
Humpty Dumpty smiled contemptuously. "Of course you don't—till I tell you. I meant 'there's a nice knock-down argument for you!' "
"But 'glory' doesn't mean 'a nice knock-down argument,' " Alice objected.
"When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean—neither more nor less."
"The question is, " said Alice, "whether you can make words mean so many different things."
"The question is," said Humpty Dumpty. "which is to be master—that's all."

Kim responded to yesterday's post in the "metadirectory" discussion with a Humpty Dumpty answer. He starts off with a Cameronesque peace offering ("It seems like some of our disagreement is a matter of terminology.") He then goes on to re-define "metadirectory" so that it becomes the answer to his question:

"Let’s make it clear that I see metadirectory as an evolving thing.

* First generation metadirectory dealt exclusively with a managing applications that had been conceived without reference to each other - or to any common framework (In truth, this is still an issue - see Jeff Bohren’s recent posting called “Which is better, Phillips or Flat-head?“).

* Second generation metadirectory has an additional focus: providing the framework by which next-generation applications can become part of the distributed data infrastructure. This includes publishing and subscription. But that isn’t enough. Other applications need ways to find it, name it, and so on. "
First to Jeff's posting. It's lovely. But it doesn't address the question. The application developer only cares about knowing how to access the data that the application needs. What form or format it's stored in doesn't make any difference. If the application developer only has SQL as the means of accessing data, then this puts the developer in the role of someone with a Phillips-head screwdriver trying to remove flathead screws, not the identity architect who provides multitudes of access protocols and methods for the identity data.

Kim talks about a "second generation" metadirectory. Metadirectory 2.0 if you will. First time I've heard about it. First time anyone has heard about it, for that matter. There is no such animal. Every metadirectory on the market meets the definition which Kim provides as "first generation". It's time to move away from the huge silo that sucks up data, disk space, RAM and bandwidth and on to a more lithe, agile, ubiquitous and pervasive identity layer, organized as an identity hub which sees all of the authoritative sources and delivers, via the developer's chosen protocol, the data the application needs when and where it's needed.

I think, I hope, that Kim will agree with me that this ID layer (the "ID bus") instituted as a hub (or transformation device) is what we need to go forward. I'm not wedded to calling it the Virtual Directory, but I'm certainly not going to call it the metadirectory, either.

Michel Prompt (whom Kim quotes extensively) calls it the "context server." I can certainly live with that.


Monday, March 24, 2008


It's unsanitary, Kim!

In a blog entry today, Kim Cameron both puts words in my mouth and twists the ones that come out to serve his "straw man" purpose.

In commenting on my recent post about the death of the metadirectory, he says: "Who would want to get in the way of Dave’s metaphors? He’s on a streak. But he’s making a fundamental mistake, taking an extreme position that is uncharacteristically naive."

What did I do? I advocated the virtual directory as the better vehicle for all of the ID data needed in the SaaS world.

Kim implies that, somehow, I called for the virtual directory to be authoritative. That's simply not so. The virtual directory is merely the conduit to the authoritative source, wherever it might be. The application developer doesn't even need to know the authoritative source of the data - or need to re-write code if that source changes.

But then he goes on to say: "Application developers like to use databases and tables. They have become expert at doing joins across tables and objects to produce quite magical results. As people and things become truly first class objects in our applications, developers will want even more to include them in their databases."

I couldn't agree more. As a developer, I always prefer to have a local cache of the data I need in a (for me) easily manipulated data structure. But that does not militate against the use of a virtual directory. Far from it. The application database (for those who cling to it like Linus and his blanket) now can serve two purposes - one to subscribe to virtual directory data and one to publish!

The application database is the authoritative source of the application-generated data, and should be linked to the virtual directory which will consume this data and make it available for other applications and services. At the same time, any data which the application consumes - but which it is not authoritative for - can be populated at run-time from the virtual directory. For the developer who thinks this is a performance hit (and for whom accuracy is less important than an extra millisecond), a "synchronization stored procedure" would handle data changes without stealing precious time from the user-application interaction. It really is win-win.
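The publish-and-subscribe arrangement just described can be sketched in a few lines of Python. This is an illustration under loud assumptions: `DirectoryHub` and `AppDatabase` are invented names, the "hub" is an in-process object rather than a networked virtual directory, and delivery is synchronous where a real deployment would more likely queue changes.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Set, Tuple

Callback = Callable[[str, str, str], None]  # (user_id, attribute, value)


class DirectoryHub:
    """Hypothetical stand-in for the virtual directory's publish/subscribe side."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, attribute: str, callback: Callback) -> None:
        self._subscribers[attribute].append(callback)

    def publish(self, user_id: str, attribute: str, value: str) -> None:
        # Push the change to every application that asked for this attribute.
        for callback in self._subscribers[attribute]:
            callback(user_id, attribute, value)


class AppDatabase:
    """An application's local store: authoritative for some attributes, a cache for others."""

    def __init__(self, hub: DirectoryHub, owns: Set[str], wants: Set[str]) -> None:
        self.hub = hub
        self.owns = owns
        self.rows: Dict[Tuple[str, str], str] = {}
        for attribute in wants:
            hub.subscribe(attribute, self._on_change)

    def write(self, user_id: str, attribute: str, value: str) -> None:
        # Only the authoritative application may change an attribute.
        if attribute not in self.owns:
            raise PermissionError(f"not authoritative for {attribute!r}")
        self.rows[(user_id, attribute)] = value
        self.hub.publish(user_id, attribute, value)

    def _on_change(self, user_id: str, attribute: str, value: str) -> None:
        # Keep the local copy of someone else's data current.
        self.rows[(user_id, attribute)] = value


hub = DirectoryHub()
hr_app = AppDatabase(hub, owns={"title"}, wants={"lastOrder"})
orders_app = AppDatabase(hub, owns={"lastOrder"}, wants={"title"})

hr_app.write("jdoe", "title", "Engineer")
orders_app.write("jdoe", "lastOrder", "PC-4411")
```

Each application still gets its comfortable local table, but it writes only what it owns, and its copies of everyone else's data are refreshed as changes are published rather than on a synchronization schedule.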

Now the argument could be made that a synchronization engine (such as in a provisioning system) could periodically update all of the various datastores with any new or changed identity data, but that simply takes the well-known synchronization problems of the metadirectory and magnifies them by the dozens, hundreds or thousands of application datastores within the organization. That's a recipe for disaster. If an individual developer, for an individual application, wishes to sacrifice accuracy and risk errors caused by out-dated data, or by data whose location has changed, in the hope of a spurious speed improvement (almost immediately unnoticeable due to the fluctuating nature of network throughput), they'll quickly learn, I believe, that "haste makes waste."

The further error Kim makes, though, is to believe that a virtual directory can't look like a SQL database to the application (or an XML database for web services developers). The folks at Radiant Logic would certainly disagree. It's all about the context. I'd invite Kim, and other skeptics, to our sessions on Identity and Context (including one about context and user-centric identity, as well as context and virtual directories) at next month's European Identity Conference in Munich.

Friday, March 21, 2008

Killing the Metadirectory

Kim Cameron comments today about my column ("Is the metadirectory dead?") which was inspired by Kim's erstwhile colleague Jackson Shaw's blog entry ("You won't have me to kick around anymore!") which included the lines: "Let's be honest. The meta-directory is dead. Approaches that look like a meta-directory are dead."

My interpretation is that the metadirectory has finally given way to the virtual directory as the synchronization engine for identity data. Kim interprets it differently. He talks about the "Identity Bus" and says that "[you] still need identity providers. Isn’t that what directories do? You still need to transform and arbitrate claims, and distribute metadata. Isn’t metadirectory the most advanced technology for that?" And I have to answer, "no." The metadirectory is last century's technology and its day is past.

The Virtual Directory, the "Directory as a Service," is the model for today and tomorrow. Data that is fresh, always available and available anywhere is what we need. The behemoth metadirectory, with its huge datastore and intricate synchronization schedule (which is nonetheless never quite up to date), is just not the right model for the nimble, agile world of today's service driven computing. But the "bus" Kim mentions could be a good analogy here - the metadirectory is a lumbering, diesel-spewing bus. The virtual directory? It's a zippy little Prius...

Thursday, August 30, 2007

Where's my data?

Dale Olds had an interesting post today ("The physical location of data matters") and Vikram Kumar had an interesting commentary ("Data location matters").

Dale states his thesis:
"The problem is that there can be very subtle problems in these systems based on where a policy is actually stored, who can access the policy, what is the security for retrieving the policy, etc.

And the slogan sounds very silly. It is 'the physical location of the data matters'."

And Kumar injects that:
"For many non-Americans top of mind when they think about the physical location of their data is the USA Patriot Act. This law presents two particularly thorny issues regarding their data stored in the US."

We sometimes lose sight of these issues as we strive to make data access as seamless as possible. From the beginnings of the virtual directory a dozen years ago to today's meshed software and mashed up services, the actual location of the data doesn't matter to the operation, to the transaction, to the application, to the service. But in terms of security and privacy (and even intellectual property) it might make a great deal of difference where the actual data resides.

Something to take into consideration.

Friday, January 19, 2007

Novell FUDdy duddy...

Sometimes the people who pull the biggest bonehead plays are those who should really know better. Novell's Volker Scheuber has attempted to explain what's wrong with virtual directory systems vis-a-vis metadirectory systems. Note that Novell is not currently offering a virtual directory system - even though they had the advantage over everyone in launching one, they just couldn't agree (both politically and technically) on how to get it done. It eventually got so bad that the product manager, Samm DiStasio, up and left for Redmond (where he's now director of the Windows Server Product Management Group at Microsoft)!

Scheuber states:
  • All data is always available as long as the central identity vault is available. In a virtual directory implementation, some of the delegated data source may not be available and requests may return no or only incomplete data.
  • A central identity vault is usually easier made high-available and fault-tolerant than a conglomeration of separate data stores.
  • In heavy load/request environments the identity vault absorbs all client requests thus protecting the backend systems from having to handle the whole load.

While that may have been true 10 years ago when Novell was first developing what became DirXML, today's Virtual Directory uses what can be characterized as proxy technology to handle all of these situations. As Radiant Logic, one of the major providers of virtualized directory services, puts it, today's virtual directory can "...access data sources dynamically and integrate on-the-fly, or use synchronization services and integrate at the back-end in conjunction with virtualization; store identities in the internal directory store for stand alone directory service, or write back to another directory or database."
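To show in miniature how a proxy layer can answer the availability and load objections, here is a hedged Python sketch of a read-through cache that serves fresh entries without touching the backend and falls back to a stale copy when the backend is down. The `CachingProxy` class, the fake HR backend and the fake clock are all invented for illustration and make no claim about how any shipping virtual directory is built.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachingProxy:
    """Hypothetical read-through cache in front of one backend datastore."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        cached = self._cache.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]          # fresh enough: the backend is not touched
        try:
            value = self._fetch(key)
        except ConnectionError:
            if cached is not None:
                return cached[1]      # backend down: serve the stale copy
            raise                     # nothing cached, so the failure is surfaced
        self._cache[key] = (now, value)
        return value


# Fake backend and fake clock so the behaviour can be exercised without a network.
state = {"up": True}
backend_calls: List[str] = []
fake_now = [0.0]


def hr_backend(key: str) -> str:
    if not state["up"]:
        raise ConnectionError("HR system unreachable")
    backend_calls.append(key)
    return f"title-of-{key}"


proxy = CachingProxy(hr_backend, ttl_seconds=10.0, clock=lambda: fake_now[0])
first = proxy.get("jdoe")     # goes to the backend
second = proxy.get("jdoe")    # answered from the cache
fake_now[0] = 30.0            # the cached entry is now older than the TTL
state["up"] = False
third = proxy.get("jdoe")     # backend is down, stale copy is returned
```

The trade-off is explicit here: the time-to-live decides how stale an answer may be, and a request for something never seen before still fails while its source is unreachable.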

Any technology needs to be able to withstand legitimate criticism. But the sort of FUD that Scheuber is spreading helps neither him, his organization nor the industry. And it certainly does no good for the potential customer.

UPDATE: Matt Flynn, reacting to the Novell posting, goes into great detail as to why the hybrid model is superior.

Tuesday, January 16, 2007

A virtual solution

Conor and Eve's responses to "Putting ID all together" correctly note that the Liberty spec is 'location agnostic' about data. I'll even agree with Eve when she states

"If all you’re storing is self-asserted info about you personally, then sure, it’s handy to consolidate all of it in one place over which you have direct control, whether that’s a traditional web app/service, a device you carry on your person, etc. But as soon as you get into information that someone else has the right to own (including mundane things like your employment status, which comes up a lot when you, say, apply for loans), I can’t see their being okay with giving you the “gold copy” to hold. That’s where multi-sourcing really shows its stuff."
(This in response to Dick Hardt's assumption that the user, having chosen an identity provider [in that discussion, an OpenID Provider or OP], would happily entrust everything about themselves to that one OP and wants all relying parties to upload any interesting facts about the user back to the OP. )

But there is, of course, a third way. And one I think is a better way. It's the tried and true "virtual directory." Data is consolidated into a repository controlled by the user. Applications query that repository for data. But the authoritative source for that data may well lie somewhere else (e.g., Eve's "employment status" data point). All that's needed is a synchronizing join engine (something the folks at Oracle, Radiant Logic, Symlabs, MaXware and the Penrose Project are very familiar with) with a new frontend or two to support attribute exchange via Liberty protocols, WS-* or even OpenID.


© 2003-2006 The Virtual Quill, All Rights Reserved

