The early history of the Rodden databank
(Source: Mark Pottenger)
From the summer of 1981 a research group in Los Angeles with members from ISAR (International Society for Astrological Research) and other organizations worked on a possible design for a database for astrologers. At a meeting on 26 October 1985 this group decided to scale back its rather elaborate plans and work on something that could actually be implemented fairly quickly, using data already collected by Lois Rodden. With design work by the research group, volunteer programming by Mark Pottenger, a lot of volunteer data entry and some data entry paid for by ISAR, the Rodden-ISAR Databank (RID) was created. The first version of the RID considered complete was done by 13 February 1988 at 8:07 PM Los Angeles time in the Dobyns-Pottenger household. It had 7,465 birth data records with 17,151 coded entries in 302 categories. People could order the data on printouts or diskettes. ISAR kept the RID active for several years with annual updates to the data by Lois Rodden. By the 1992 release, the RID had 29,837 entries in 305 categories for 14,130 people.
The next generation of the database was a joint project of ISAR and NCGR (National Council for Geocosmic Research), converting the RID data into IDEA (International Data Exchange for Astrology) in 1993, with programming in a proper database system by J. Lee Lehman.
Those 1980s ISAR meetings, RID and IDEA led to the AstroDatabank.
Article by Mark Pottenger in the December 1992 issue of Kosmos
From Monster base to RID to IDEA
ISAR has been working for a long time to make quality data available to the astrological community. Several members of ISAR met intermittently in Los Angeles starting in the summer of 1981 to try to develop a system for coding and storing information for use in astrological studies. Feedback from conference attendees and others outside the working group was also solicited. The project went on for several years and resulted in a description of a possible database coding scheme which was too large and comprehensive for anyone to implement. The last formal revision of that structure was Version 1.1 on February 8, 1984. Some of us nicknamed this project the monster base.
At a meeting on October 26, 1985 a decision was finally reached to use a list of codes based on data Lois Rodden had already collected and categorized instead of trying to implement the monster base. This allowed us to begin actual implementation of a database on computer. Much of the basic outline of the RID (Rodden/ISAR Data bank) was set at that meeting.
After some delays from an unfinished attempt by another board member to get data entry set up on Apple II systems, data entry was actually started in 1987 with some quick and dirty programs set up by Mark Pottenger, research director of ISAR. As much data entry as possible was done by volunteers (especially massive amounts by Marguerite dar Boggia), and the rest of the data entry was paid for by ISAR.
The first version of the RID was available and the first catalog finished on February 14, 1988, with an announcement in the Winter 1987-1988 Kosmos. Data from the RID is primarily available in the form of printouts, although it can be ordered on disk in a few formats. The data stored and the printouts include basic birthdata (including Lois' rating of accuracy), codes for the categories assigned to the person or event, codes for related records in the database, alternate birthdata or notes about the data, and brief text notes about the person or event. The data includes many public figures and even more people collected for membership in a particular occupation or other category. (Much of the category data has a category label instead of a person's name.) Data ordered by category is priced at a few cents per record, since one of our aims is to promote the use of the data in research. Data ordered by name is more expensive, since it is more work for us to process and less likely to be used for research projects. The RID catalog includes a list of all category codes used, counts of people in all categories, and a list of all names and category labels in the database.
Lois has continued to add new data and corrections and additions to old data since the RID was first released, putting in a massive amount of time each year. Starting with the 1990 release, the RID included a small amount of data from people other than Lois, but most of the data in it is still collected through her efforts. At the beginning of each calendar year, we have established a year's version of the RID and updated the catalog to match it. The first version of the RID in 1988 had 17,151 category code entries in 302 categories for 7,465 people. The current (1992) RID and catalog have 29,837 entries in 305 categories for 14,130 people and events. (Catalog ordering information is on the inside front cover of Kosmos.)
The programs used to maintain and use the RID are still somewhat modified updates of the quick and dirty programs thrown together in 1987 to get the project out of a stall. Those programs are written in BASIC, with no database indexing or other access conveniences since they were not originally designed for much more than data entry and checking. Since the database has almost doubled in size in the meantime, performance of those programs has gone from slow to painfully slow. Since a complete rewrite was obviously needed, in 1991 the ISAR board discussed other changes we might want as well. The result of those discussions is a plan for the next generation: IDEA.
IDEA (International Data Exchange for Astrology) is intended to be an updated and expanded successor to the RID. ISAR will contribute the entire RID, with all the thousands of dollars of data entry and checking and all the unpaid volunteer effort that went into it. The board of NCGR has agreed to join in as a sponsor of IDEA, and their research director, J. Lee Lehman, is writing new programs with a proper database management system. When the IDEA programs are ready, ISAR will also buy a computer for the data to be kept on. Each organization will promote IDEA to its membership, both as a source of already available data and as a central database to which anyone can contribute data. Contributors of at least 1,000 records (timed information for people, places and events) can receive royalties from IDEA if anyone orders from that data. People who don't contribute enough or who don't want royalties can assign their royalty credit to an organization such as ISAR or NCGR. First contributors will always get royalty credit, so don't expect much if your data duplicates what someone else has already done. Corrections will get royalty credit, but discrepancies will have to be resolved with documentation. Anyone interested in more details of the current draft of the IDEA proposal can write to ISAR. Anyone interested in contributing data in machine-readable form should contact Lee for details of what formats she can handle and what the minimum required data is. We hope to publicize IDEA more than we have publicized the RID and draw many more contributors and users. Look for the first release of IDEA in 1993.
Panel discussion in 1994: The International Data Exchange
Panel members: J. Lee Lehman, Mark Pottenger and Lois M. Rodden
JLL: I am Lee Lehman, the NCGR Research Director, so I am the NCGR person today.
LMR: She is the Virgo in the crowd. I am Lois Rodden, the Gemini in the crowd, the Data Base Administrator for IDEA, which stands for The International Data Exchange for Astrology.
MP: I am Mark Pottenger, the Research Director for ISAR, the International Society for Astrological Research. Lee, why don't you give the overview of data exchange?
JLL: Basically, the whole idea of data bases became feasible as computer prices came down, megabytes of hard drives went up, and when you have massive quantities of data, which we discussed this weekend at the research track. The biggest problem people have had in doing astrological research is data collection. It is difficult to get a large sample. ISAR led the way in data collection, and Lois specifically provided some resources to try to get people together. We should also mention that while IDEA is a separate corporation, formed jointly between ISAR and NCGR, it is not the only one in the world with a large data base. There are a couple of European data bases which are also fairly large. We are looking to incorporate all of these into one system. The people working with large data bases, I think, to a large extent, are a friendly group. All understand that the more we share the more we all have. That is one of the real essences of a project like this. Jointly the whole is so much more than the sum of the parts. IDEA is here. Anyone who wants to look up specific things can do so. We have the birth data itself and a system for categorizing it. You could also ask for particular categories of data. So far, the data is personal data. We are now working on integrating event data. Then we are going to work on a system for untimed data. We may have the date or the month but we may not have the full system as we work with it. Some are working on synodic and longer cycles, that then is every bit as good. We are looking to cover the whole world. So the IDEA is über alles. As this goes along, I would certainly not categorize this as being fixed and concrete. We are always looking for ways to improve it; for people to submit data on computer disk, and for ways to disseminate it that way, rather than on plain old-fashioned paper. Basically we are looking for whatever ways we can help astrologers by providing data. 
That is not to say that people who are doing a certain project are not going to gather their own. At least it would be marvelous to provide data for a pilot study. Then, hopefully as people use our service in terms of getting data, then they expand their data facilities, and then it comes back to us.
LMR: The seed of IDEA began about ten years ago in Mark's living room, when the ISAR Board was meeting. About four or five of us on the Board and a couple of others not on the Board, occasionally came in and offered ideas.
MP: Yes, including Steve Hines of Microcycles and Linden Leisge of Church of Light.
LMR: Yes, Brooke Smith and I think Jim Eshelman offered ideas. Then you, Brooke Smith and I started working on the data base with the questionnaire. The questionnaire grew large enough to cover everybody's freckles. We nicknamed it The Monster. It became so big it was unwieldy. Then in 1985, I made a proposal to ISAR that we do a short form, of perhaps a hundred vocations, maybe a hundred interests. We worked out a short form that came out to about 307 categories. I said, "O.K., we can go ahead; I have all my data on index cards. We can catalog it and start a data bank that astrologers can access for their studies." We came up to 145 vocational categories, 88 noteworthy, anything from models to lucky winners, 34 human interest, multiple marriages, gays, transsexuals, and 36 relationship categories. It took a while to get that first data bank going.
MP: We had one board member who had volunteered to do the data entry and record keeping program and after waiting a very long time, when nothing was done, I did a throw-together, quick and dirty program so that we could start entering the data. Of course, I had to make minor changes in the program along the way; but up until 1993, that program was used for RID (the Rodden/ISAR Databank). That was the ISAR project. After many years of discussing the feasibility of joining with NCGR in expanding the data base, we eventually made a switch-over in 1993 to programming in a proper data base done by Lee Lehman which is where the change to IDEA comes in. IDEA is not just ISAR. It is ISAR and NCGR as sponsoring organizations. It has the mechanism to include other organizations if they want to come in as a sponsor and have the commitment to give free publicity for IDEA in their publications and to do the things necessary as a sponsor. Organizations can help IDEA grow by promoting it.
LMR: The idea of IDEA is to make it open to the entire astrological community to contribute to, so that there is no politics and no personal ownership of any organization or Rodden or whomever, because eventually the Rodden data will be a segment. There will be NCGR data, the Gauquelins' data, Sara Klein's data, and other individual contributions.
JLL: I think we should also give a plug to UAC, because what greatly facilitated this is that both NCGR and ISAR, as two of the three co-sponsors of UAC, had been working together on UAC since 1986.
LMR: I think one of the greatest strides we are making in astrology is shown by this Del Mar conference where six organizations have combined to make the conference. The fact that ARHAT can be working with us or the Foundation for the Study of Cycles is accessible to us in offering us material, shows how all can gain through more cooperation. To me the ultimate Pluto goal is cooperation, for each to give to the whole for the good of all. Of course, we will always have personalities, because we have egos. Appropriately as an astrologer I noted on February 13, 1988 at 20:07:06 PST in Mark's Los Angeles living room, Mark announced that the RID program was completed, so that historically was the beginning. My contract with ISAR was a little later. I will do the data and code it and you people (referring to Lee and NCGR) will enter it on the computer and fill the orders. By now we have about 30,000 category entries.
MP: Francoise Gauquelin gave us the permission to enter in all of the Gauquelin professional data. We have a large earthquake sample of data to enter into the computer. As Lois mentioned we have been offered data from the Foundation for the Study of Cycles.
JLL: I would say that within six months we should easily have over 100,000 entries.
LMR: One thing that is very important is maintaining quality control. Not only for initially cataloging and categorizing the data, but for keeping updates. As a data collector for the last 25 years I know that corrections are coming into my office every day even after 25 years. I have to put those corrections into the computer, be in touch with IDEA, make sure that all the updates are there: updates on codes and information, who died, who had another 16 marriages. Also there has to be quality control on data sent in to IDEA. Something Lee and I are working on is the Guidelines for Submission of data.
JLL: I think there is another side of this which Lois quite rightly has been harping on and I will add, that when people want to know what they can do for us, one thing they can do is to go to every organization of which they are a member and demand that articles published in the publication cite sources. The source is not The American Book of Charts!
LMR: Exactly! All the Gauquelin material is from birth certificates.
MP: Actually the Gauquelin data is from birth records and not birth certificates. For some information they wrote to the Registrar of Birth and the clerk would write down the information and send it to them.
LMR: Yes I always specify BC for birth certificate and BR for birth record which is not an official document.
MP: We have some hereditary data, but I am not sure we will incorporate it. All the hereditary data is simply numbers: mother #1, mother #2, daughter #1, son #1. Unless there is something giving the names which I am not aware of, we may have to put it all in a special cul de sac, as it were, so as not to confuse the issue.
JLL: This ultimately becomes one of the issues we will have to deal with. As far as how people are ordering the data, the orders are coming in partly by category, and partly for a specific name. There are issues we will have to deal with not only as to how it gets into the data base, but there is also the issue as to how it comes out of the data base. For example, one of the issues we have not resolved about the earthquake data is, are we going to have a cut-off point in terms of lowest number entered on Richter scale. One of the things that happens, is that when you talk to someone researching earthquakes living on the East Coast, they will have a little blip in the New York Times about a 4.1 earthquake that took place in San Francisco yesterday and I am here going chuckle, chuckle, chuckle.
LMR: For the last year and a half in my area we have had 30,000 to 40,000 quakes.
JLL: So the question becomes when is it a big enough quake for our purposes to put in a data base?
MP: Is there anyone here not familiar with the "Rodden Rating?" I will read it off for the purposes of the tape: AA: recorded by State or family from birth record or birth certificate or family bible if it was recorded at birth; A: from the person or associate; B: Biography or autobiography; C: Caution - no verified source; DD: dirty data, unverified quotes.
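The rating scale read out above can be written down as a small lookup, shown here as a minimal Python sketch. The descriptions paraphrase the ones given in the panel; the `RELIABILITY_ORDER` list and the `at_least` helper are assumptions added for illustration (the panel only lists the ratings in this order) and are not part of any RID or IDEA program.

```python
from enum import Enum


class RoddenRating(Enum):
    """Rodden Rating codes, with descriptions paraphrased from the panel."""
    AA = "recorded by state or family: birth record, birth certificate, or family bible"
    A = "from the person or an associate"
    B = "biography or autobiography"
    C = "caution, no verified source"
    DD = "dirty data, unverified quotes"


# Assumed ordering from most to least reliable, following the order read out.
RELIABILITY_ORDER = [
    RoddenRating.AA,
    RoddenRating.A,
    RoddenRating.B,
    RoddenRating.C,
    RoddenRating.DD,
]


def at_least(rating: RoddenRating, minimum: RoddenRating) -> bool:
    """Return True if `rating` is as reliable as `minimum` or better."""
    return RELIABILITY_ORDER.index(rating) <= RELIABILITY_ORDER.index(minimum)
```

A researcher who wanted only well-sourced data could, under these assumptions, keep records for which `at_least(rating, RoddenRating.A)` is true.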
JLL: There is a variance over time. When you get into 17th Century data or 16th Century data, that may be one sore spot.
LMR: Michelangelo's grandfather wrote, "he was born to my son between the third and fourth hour after sunset". We have the Florentine calendar where the day starts at sunset, and the day we would call it, would be a different day if it was before or after sunset. This again is part of the quality control. There are astrologers who send in data that does not have a location or it has a wrong time zone or a wrong time signature or the wrong latitude and longitude. Everything that comes in must be edited.
MP: To give you an idea of what the data looks like when someone places an order, this transparency, which is one of the pages in your handout, shows a printout (this was famous people). Rather than ordering it by category, this is saying, "I want Bill Clinton data."
LMR: There is a list of the abbreviations, in case somebody from another country or even this country cannot understand. A few abbreviations are: Q for quotes, Mom, Dad, etc. The purpose is to economically make use of our three lines of text.
MP: Basically you get the full birth data, the Rodden Rating, and any category codes. Some people will have a lot of codes. If there are any relationships with anyone else in the data base, you will get what the relationship is, with the condensed comments. When you are dealing with the dirty data, there will be notes in the birth data section of varieties of time of birth or whatever.
Audience member: I see here that you have only the year of marriage for Hillary Rodham Clinton. How frequently do you update?
LMR: I update the data continually every day but what is released to people in the catalog is updated annually. At the present time I enter the corrections into my computer and then send a Xerox copy of the corrections to Maggie Meister to enter. Then the newest updates will be coming out in IDEA. The catalog will not reflect it for a year.
MP: If you will turn the page of your handout to the Dirty Data on Ronald Reagan there are a lot of relationships there. On the birth data there is DD: with somebody quoting him, but then also 2:00, 3:46, 5:04, 13:53, 14:00, and that is not all of them. I think Lois just got tired.
LMR: Rather than try to give who gave what and what was their idea for the rectification, all of this is dirty data. You sort it out or else write to me if you want the whole two pages of Reagan.
JLL: Incidentally you will notice that all of these printouts come with a creation date and an updated date. Those are now generated by the data base program itself. They were not created on June 24, 1993. That was when I did the final transfer of the data that Mark had sent me. From here on out, when there are changes, it will be the update date. This is part of the data base design at this point but I am in the midst of having the system set up so that orders that people place are captured so that every bit of data that you order we know about. This means, among other things that if you want to come back and say, "Could you send me a list of any updates on this data?" we will have that facility. We will also have the facility, if we have come up with additional data in that category beyond what you ordered six months ago, to allow you to specifically place an order for the differential.
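The two facilities described here, a list of updates to data already ordered and an order for the "differential", can be sketched with plain set arithmetic. This is only an illustration in Python under stated assumptions: the record ids, dates, and function names are hypothetical, and nothing here reflects how the actual data base program stores orders.

```python
from datetime import date


def differential(category_ids, ordered_ids):
    """Record ids now in the category that were not part of the earlier order."""
    return sorted(set(category_ids) - set(ordered_ids))


def updated_since(update_dates, ordered_ids, order_date):
    """Previously ordered record ids whose update date is later than the order date."""
    return sorted(
        rid for rid in ordered_ids
        if update_dates.get(rid, order_date) > order_date
    )


# Hypothetical data: record id -> update date kept by the data base.
update_dates = {
    101: date(1993, 6, 24),
    102: date(1994, 3, 1),
    103: date(1994, 5, 10),
}
earlier_order = {101, 102}          # ids captured when the order was filled
order_date = date(1993, 12, 1)

new_records = differential(update_dates.keys(), earlier_order)
changed_records = updated_since(update_dates, earlier_order, order_date)
```

Capturing the ids in each order is what makes both questions answerable later; without that log, only the full category could be re-sent.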
MP: Elizabeth Taylor is an extreme example for the number of categories and relationship entries. You only have four husbands on here.
LMR: Oh my heavens! That needs to be updated.
MP: That is data on famous people. The format is the same when you order things by category. On category codes you do not know who the people are going to be. You simply say, I want the X number of entries in this category and you get whatever people in the data base have that code as one of their codes. Even when you order by category you still get all the information for each person who has the code. On this page of your handout, this was simply the first few entries in the data base for category 3-170. Guess what category that is? Homosexual males. At first I thought that Lee was playing a joke on me giving me famous people to show how great homosexuals were. Then I realized it is simply that the oldest entries we have for people we know were homosexual are famous people. When I gave this data to Lee it was all in chronological order. The first entries were all famous ones.
AUD: Will people be able to order data of anyone born on June 5th?
JLL: Yes. There are a couple of technical reasons why we did the switch from Mark's program to the data base, beyond the question of Mark being another over-committed person: One of the advantages of data bases is that it is very easy to do searches on any field. What we already have which Mark did not have is an alphabetical index of everybody. If anybody calls, I can instantly tell them what is in the data base. The other thing is for any other piece of information such as a date or year of birth or time of birth or Rodden Rating or city or anything else. You can do searches of combinations of these characteristics up to about 15 different characteristics.
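A search on a combination of fields, as described in this answer, amounts to keeping the records that match every requested value. The sketch below is plain Python, not FilePro; the field names, the placeholder records, and the hard limit of 15 criteria (the panel says "up to about 15") are assumptions for illustration.

```python
MAX_CRITERIA = 15  # assumed cap, based on "up to about 15" in the discussion


def search(records, **criteria):
    """Return the records whose fields equal every value given in `criteria`."""
    if len(criteria) > MAX_CRITERIA:
        raise ValueError(f"at most {MAX_CRITERIA} criteria are supported")
    return [
        record for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]


# Placeholder records; names and values are invented, not taken from the databank.
records = [
    {"name": "Person A", "month": 6, "day": 5, "year": 1950, "rating": "AA", "city": "Los Angeles"},
    {"name": "Person B", "month": 6, "day": 5, "year": 1962, "rating": "C", "city": "Chicago"},
    {"name": "Person C", "month": 11, "day": 2, "year": 1950, "rating": "AA", "city": "Los Angeles"},
]

june_5 = search(records, month=6, day=5)
june_5_aa = search(records, month=6, day=5, rating="AA")
```

Here `june_5` answers the audience question about everyone born on June 5th, and adding `rating="AA"` narrows the same search by a second characteristic.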
AUD: This is a data base you wrote, it is not commercially available is it?
JLL: Yes it is, it is called FilePro.
AUD: Could I go into CompUSA and buy this data base?
JLL: Yes, I wouldn't necessarily recommend that you do so, however, because the reason that it is in this particular data base, is that as a data base programmer I work daily with my computer clients. It is not necessarily extremely user friendly for a non-database person. Anybody who wants the data in a dBase (TM) format or some of the other common formats, I can put it out that way. I can also put out the data in the form of CCRS and other files for someone who wants it coming out that way.
LMR: Yes, I would like to see these capacities itemized also. We are interacting with astrologers in other countries. Grazia Bordoni in Italy will exchange with us. Dr. Niehenke in Germany will exchange with us.
MP: In terms of getting new data in, anybody who contributes at least 1,000 records of data that are accepted after the editing process can get royalties on any orders. Be they ever so humble.
JLL: You are also welcome to contribute the royalties to IDEA.
MP: One of the founding ideas behind IDEA is that it will be priced to cover its own expenses and if it makes any profit, that profit will be in the form of research grants, because it is a non-profit organization. The reason I said: 'after editing' is that there is already a lot of data there. When there is a duplication, then the first person gets the credit. Right now, there is almost no mundane data in there. The huge earthquake file is about to go in. We have 500 UFO sightings that Marcello Borges from Brazil sent in.
Carol Tebbs: Might we address one more thing. In the vision that we had in ISAR when we started with the RID, and now we are in the IDEA phase, could you mention the third phase we would move into?
MP: Right now the program is in shake down, because Lee has been developing the program, shaking things out, adding remaining features and also processing the orders. Eventually this could get to the point where we have several international organizations (Lois already said we have exchange rights) where we have a copy in this country that people could order from; a copy in a European country so that people could order from there, and all of those have the merged contents of all the cooperating organizations who are collectors of data. So this could grow quite a bit.
CAROL: And the third phase we envisioned too, would outgrow the capacity of Lee to fill orders.
MP: Yes, then it would go to Astrolabe, or ACS or any other chart service so long as it does not cost them anything and is self-supporting.
JLL: Eventually you have to factor in salary which at this time is not being done. Everyone volunteers his services.
MP: For the purposes of the tape I should give the address for orders. The address is IDEA, c/o ISAR, P.O. Box 38613, Los Angeles, California 90038-0613. Send $3.00 for the catalog and you will receive all of the names in the data base.
LMR: Are we going to charge $4.00 or $5.00 for the catalog?
MP: At the moment we haven't set up something. This is the last catalog of the first phase that we are handing out. Why let paper go to waste.
JLL: Let me also give out two other numbers which should take effect shortly. We are hoping that within the next couple of weeks NCGR has the capability of accepting Master Card and Visa, I hope to have that implemented in another week or two at which point we will be able to take orders by phone and fax. The telephone number for orders, and this is not on your publications, is area code 407/722-9500. Do not be surprised if the line is answered NCGR, because it is NCGR's Conference Information Line. The fax number is 407/728-2244. I think that will do a lot to facilitate international orders, because we can get around currency conversions with Master Card and Visa.
LMR: We are not changing the address yet because I find that the astrological community takes about three years before they find out you have made a change.
MP: Also we do not know who will be processing the orders after Maggie and Lee finish processing. Ordering by name is more expensive. It is more work and we are trying not to be too overcompetitive with others offering data. We do not want to drive anyone else out of the business. Ordering by category is a lot cheaper because one of the main intents is to help researchers. That is ten cents per item for the first one hundred and then it goes down to five cents per item. There are discounts for members of ISAR and NCGR and other groups as they become involved.
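The category pricing quoted here (ten cents per item for the first one hundred, five cents per item after that) works out as in the short Python sketch below. The function name is hypothetical, amounts are kept in whole cents, and the member discounts are left out because the panel gives no figures for them.

```python
FIRST_TIER_ITEMS = 100   # items charged at the higher rate
FIRST_TIER_CENTS = 10    # cents per item for the first hundred
LATER_TIER_CENTS = 5     # cents per item after the first hundred


def category_order_cost_cents(item_count: int) -> int:
    """Cost in cents of a category order, before any membership discount."""
    if item_count < 0:
        raise ValueError("item_count must not be negative")
    first_tier = min(item_count, FIRST_TIER_ITEMS)
    later_tier = max(item_count - FIRST_TIER_ITEMS, 0)
    return first_tier * FIRST_TIER_CENTS + later_tier * LATER_TIER_CENTS


cost_250 = category_order_cost_cents(250)  # 100 * 10 + 150 * 5 cents
```

On these figures an order of 250 records in one category comes to 1,750 cents, that is $17.50.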
CAROL: For people who are in leadership positions or for people who listen to this tape what we are asking of people who come in as sponsors is that they disseminate the information of our data base service in their publications or to friends and researchers.
MP: Yes, publicity is a requirement for a sponsoring organization. Also encouraging them to submit data they have is something we would like.
LMR: Yes. I have a data service which I have had for years where I charge $5.00 per item, when people call me and say they want data right now.
JLL: The phone orders will be a little more expensive. Also we do not want to price compete with other organizations. We want to cooperate.
MP: Also, we do not have guidelines on submissions of data to hand out to you, but anyone desirous of submitting data, could fax it to the number that Lee gave or they can telephone the other number or they can contact Lee at P.O. Box 501107, Malabar, Florida 32950, and we will send submission guidelines, including computer formats, etc.
JLL: If an individual wants to have information downloaded to a mailbox or one of the services and either I or Mark have access to that service, we can supply the data. Also, anybody who likes Procomm and does modem transfers can also talk to me. We will have mundane data for you too.
LMR: The first thing we want to do is to reach the community and say that we have an international databank. Members of local organizations can have group projects such as collecting data.
MP: What we supply is the data and not the chart. We talked about some time in the future to have the data service reside with some chart service so that if someone wanted to order data plus charts, they could do so. That might also give the chart service an incentive for handling IDEA. We do have verbal agreements from the major chart services to do that so long as it does not cost them money. Who knows what the pricing will be after Lee gets finished with the shakedown period.
CAROL: I think a good point has been made that each organization and even chapters within organizations adopt certain data gathering or research projects by seeing what needs to be done and then doing it.
MP: If we do establish a section for untimed data, one project could be getting all of the people in Who's Who, so that they can have the date as a starting point. Then if someone wants to do further research he will at least be able to write and say, what is the date of this person?
JLL: Not only the Who's Who, but all the sports publications. Since birth certificates are hard to get, another project is obtaining death certificates, which contain the date of birth. When you come up with results from your research you can send them to KOSMOS or the NCGR JOURNAL. In NCGR we have facilities to help people who want to do statistical research. That is my bailiwick. I have professional training in statistical studies and we will be happy to work with people in the design of their program and in the analysis of results. We encourage them to be written up and published in either journal. Negative results have intrinsic interest, as do replications.
MP: By the way, if anyone is interested in mundane data, we have from John Van Zandt a coding system which he has worked up for mundane data which we are trying to work into a form that we can adapt for the IDEA.
LMR: If people want to contribute they can draw data exchange for their contribution; such as Marcello Borges sending in 500 UFO sightings. He drew off certain categories to work on in exchange for his contribution.
Links and References
- This information was provided by Mark Pottenger in an email of 22 March 2009, with the permission to publish on this wiki