Abstract
1. Introduction
The OECD “has provided probably the best known set of guidelines” for the protection of privacy. The basic thesis of this paper is that the OECD privacy principles can be formulated on a firmer theoretical foundation in terms of a specific set of acts on personal information. This transforms them into a complete and precise system of rules for handling personal information. The extended version of this paper reviews privacy guidelines, regulations, laws, and technology, including the EU Privacy Directive, HIPAA, P3P, EPAL, and XACML.
2. Personal Information
For us, personal information (PI) is any linguistic expression that has referent(s) of type person. Accordingly, there are two types of PI:
- (1) Atomic personal information is an expression that has a single human referent. “Referent” in this case implies an identifiable (natural) person.
- (2) Compound personal information is an expression with more than one human referent.
If x is a piece of atomic personal information of a person (the referent), then he/she is its proprietor. Compound PI is reducible to atomic PI.
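The two definitions can be illustrated with a minimal sketch. The class name `PersonalInfo`, its fields, and the example sentences are hypothetical and are not part of the paper's formalism; the sketch assumes that the referents of an expression have already been identified, and it does not model how compound PI is reduced to atomic PI.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PersonalInfo:
    """A linguistic expression plus the identifiable persons it refers to."""
    expression: str
    referents: frozenset  # assumed to be supplied by the caller

    def kind(self) -> str:
        # No human referent means the expression is not PI at all.
        if not self.referents:
            return "non-personal"
        return "atomic" if len(self.referents) == 1 else "compound"

    def proprietor(self) -> Optional[str]:
        # The text defines a proprietor only for atomic PI,
        # so anything else yields None in this sketch.
        if self.kind() == "atomic":
            return next(iter(self.referents))
        return None


atomic = PersonalInfo("John is 30 years old", frozenset({"John"}))
compound = PersonalInfo("John and Mary are married", frozenset({"John", "Mary"}))

print(atomic.kind(), atomic.proprietor())
print(compound.kind(), compound.proprietor())
```

Returning `None` for compound PI is a design choice of the sketch: it avoids guessing how proprietorship would be distributed before the expression is reduced to atomic PI.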
3. Personal Information Flow Model (PIFM)
The personal information flow model (PIFM) divides the handling of PI into four phases: creating, collecting, processing, and disclosing PI (Al-Fedaghi, 2006), as shown in figure 1.

New PI is created by proprietors or non-proprietors (e.g., medical diagnostics by physicians) or is deduced by someone (e.g., data mining that generates new information from existing information). The created information is utilized in some usage (e.g., decision making), stored, or immediately disclosed. The processing phase involves acting on the PI (e.g., anonymizing, data mining, summarizing, translating). The disclosure phase refers to releasing the PI to insiders or outsiders.
The PI flow architecture contains 17 types of acts on PI labeled A through Q. These acts form ordered chains. For example, if we have two agents, 1 and 2, then $A_1 F_1 B_1 O_2 I_2 D_2 B_2 O_1 E_1$ means: Collected ($A_1$) personal information will be stored ($F_1$) and disclosed ($B_1$) to a credit company that collects ($O_2$) it to process ($I_2$) it to check credit and categorizes ($D_2$, $B_2$) as OK/not OK. Upon collecting ($O_1$) the result of this checking, the personal information is used ($E_1$) for delivery.
Figure 2 shows the PI flow for this chain where irrelevant details in each region are omitted.
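A chain of this kind can be handled mechanically. The following is a minimal sketch, assuming each step is written as an act letter followed by an agent number (for example `A1`); the function names `parse_chain` and `acts_by_agent` are hypothetical. The sketch only checks that each label lies in the range A through Q and does not check whether the order of acts is permitted by the model.

```python
import re
from collections import defaultdict

# The 17 act labels A through Q mentioned in the text.
VALID_ACTS = set("ABCDEFGHIJKLMNOPQ")


def parse_chain(chain: str) -> list:
    """Split a chain such as 'A1 F1 B1' into ordered (act, agent) pairs."""
    steps = []
    for token in chain.split():
        match = re.fullmatch(r"([A-Z])(\d+)", token)
        if match is None or match.group(1) not in VALID_ACTS:
            raise ValueError(f"not a valid act token: {token!r}")
        steps.append((match.group(1), int(match.group(2))))
    return steps


def acts_by_agent(steps: list) -> dict:
    """Group the acts by the agent performing them, preserving order."""
    grouped = defaultdict(list)
    for act, agent in steps:
        grouped[agent].append(act)
    return dict(grouped)


steps = parse_chain("A1 F1 B1 O2 I2 D2 B2 O1 E1")
print(steps)
print(acts_by_agent(steps))
```

Grouping by agent makes it easy to see which acts each party performs, which is the granularity at which a privacy policy could attach restrictions.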

The rest of this paper is divided into seven sections; each section includes a detailed anatomy of each of the OECD principles. Space limitations allow for the presentation of the analysis of only the first of these principles. The basic purpose of our work is to provide a complete formulation of the underlying rules. In practice, a privacy policy can adopt all of the resultant subrules, generalize them, or even cross out some of them. The fine-grained subprocesses of PI handling provide an opportunity to tailor privacy policies according to the requirements of each subprocess.
4. The Collection Limitation Principle
The Collection Limitation Principle, denoted as OECD1, states that:
There should be limits to the collection of personal data and any such data should be obtained by lawful and fair means and, where appropriate, with the knowledge or consent of the data subject.
We will analyze this principle in terms of the four phases of PIFM.
4.1 Collecting Phase
OECD1 uses the term “collection” to refer to the mere act of collecting PI. This is exactly the same as the first phase in PIFM. If this interpretation is correct, then we can ask: what about limits on the other phases of PI handling, namely, the processing, creating, and disclosing phases in PIFM? For example, data mining techniques can produce new PI that is not collected. Why don’t we specify explicitly, as in the case of collection, that there are limits to the mining of PI and that any such mining should be performed by lawful and fair means and, where appropriate, with the knowledge or consent of the data subject? The same question can be raised with regard to manual internal examination of data (e.g., psychological character analysis), which also can produce new PI. The point here is that limits can be specified on all phases of acting on PI, not just on the collection phase. Consequently, we will go through each phase to examine the issue of limiting acts on PI. Figure 3 shows the collection phase and its associated acts in PIFM.
The collection phase refers to collecting “raw” PI directly from an outside source. “Raw” means data as received by the gathering agent. Two notions are embedded in the OECD1 principle:
- (a) Limiting gathering of PI: e.g., lawful and fair.
- (b) Limiting the collecting, use, and storage of PI within the knowledge or consent of the proprietor.
These notions should be separated because they sometimes are not directly related. Obtaining PI by lawful and fair methods does not imply any consequences for the methods of acting on PI. For example, I can collect PI lawfully and then misuse it or fail to inform its proprietor. Conversely, I can limit my use of PI even though I collected it unlawfully. The Limitation principle includes seven rules to be observed by agents according to the types of PI.
The first two rules concern a gathering agent and limit handling of PI at the collecting phase only:
1. A gathering (collecting) agent should gather PI by lawful and fair means.
2. A gathering agent should collect, use, and store PI with the knowledge or consent of the proprietor.
4.2 Processing Phase
As shown in figure 4, processing of PI is performed on either collected information (I) or on created information (P). Created information is created by the agent itself, either manually (e.g., reporting by a newspaper reporter) or automatically (e.g., by a mining program). Two types of mining are shown in figure 4, “implied PI” (IPI) mining and “new PI” (NPI) mining. IPI mining generates implied PI (e.g., by transitivity). NPI mining generates new PI, as in using categorization of other persons’ PI to generate that John is a risk. The figure also includes other types of processing that do not generate implied or new PI but only change the appearance of PI, such as comparing, compressing, and translating.

We can extend the limitation to the processing phase:
3. A processing agent should acquire PI by lawful and fair means.
4. A processing agent should process, use, and store PI with the knowledge or consent of the proprietor.
Rule 4 covers all types of processing, including mining and non-mining processing.
4.3 Creating Phase
In the creating phase, limitation aspects are shown in figure 5. Following the same procedure, we obtain the following rules.

Two additional rules are necessary to cover the creating phase:
5. A creating agent should create PI by lawful and fair means.
6. A creating agent should use and store PI with the knowledge and consent of the proprietor.
Notice that we have not specified that a creating agent should create PI with the knowledge and consent of the proprietor. Also, “store” here refers to formal permanent recording of PI, such as profiling. These rules cover new PI generated by a mining program. They also cover PI reported (created) by a police detective or a paparazzo. As we mentioned previously, in practice, a privacy policy can opt to restrict the application of these subrules. Restrictions on the rules can be easily added to specify “permitted” use of PI. For example, the EU privacy directive includes many exceptions for reasons ranging from security to public health. Notice how precise our model of rules is. In our system, if restrictions are specified in rules 5 and 6, then they are applied to creating PI, not to gathering, processing, or disclosing.
4.4 Disclosing Phase
Creating, gathering, and processing phases have “input” that begins with obtaining PI and “output” that channels PI outside the phase. The disclosing phase is an “outlet” or a channel to flush PI out to others. Its “output” is already recorded as collected PI by these others. This is illustrated in figure 6. We end up with the 7th rule, which covers disclosing all types of PI.
7. A disclosing agent should disclose PI with the knowledge and consent of the proprietor.
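The seven rules can be collected into a small lookup table, which makes it easy to see which rule governs a given act in a given phase. This is only an illustrative sketch: the `Rule` class, the `rules_for` helper, and the act names are hypothetical, and the numbering 1 through 7 assumes the rules are counted in the order in which they are introduced in Sections 4.1 through 4.4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    number: int        # assumed numbering, in order of appearance in the text
    phase: str         # "collecting", "processing", "creating", "disclosing"
    acts: tuple        # acts that the requirement applies to
    requirement: str   # wording taken from the corresponding rule


RULES = (
    Rule(1, "collecting", ("gather",), "lawful and fair means"),
    Rule(2, "collecting", ("collect", "use", "store"), "knowledge or consent"),
    Rule(3, "processing", ("acquire",), "lawful and fair means"),
    Rule(4, "processing", ("process", "use", "store"), "knowledge or consent"),
    Rule(5, "creating", ("create",), "lawful and fair means"),
    Rule(6, "creating", ("use", "store"), "knowledge and consent"),
    Rule(7, "disclosing", ("disclose",), "knowledge and consent"),
)


def rules_for(phase: str, act: str) -> list:
    """Return the numbers of the rules that constrain `act` within `phase`."""
    return [r.number for r in RULES if r.phase == phase and act in r.acts]


# Creating PI is bound by the lawful-and-fair rule only, not by a consent rule.
print(rules_for("creating", "create"))
# Storing created PI falls under the consent rule of the creating phase.
print(rules_for("creating", "store"))
print(rules_for("disclosing", "disclose"))
```

Keeping the phase as an explicit field reflects the point made above: a restriction attached to the rules of one phase does not spill over into the other phases.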
References
Al-Fedaghi, S. (2006). Personal Information Model for P3P. W3C Workshop on Languages for Privacy Policy Negotiation and Semantics-Driven Enforcement, 17-18 October 2006, Ispra, Italy.
