[personal profile] mmcnealy

Today was a very good Friday: I got everything wrapped up for the week, with homework done and two pies in the oven by 4:30.  Tomorrow Matt has to work, so I am off to a Culinary guild meeting at 1 until whenever, which should hopefully make for a good day of chatting with fun folks and cooking.  Tonight the plan is to stay in, eat fried chicken and mashed potatoes, have cherry pie for dessert and play dominoes. Just a nice quiet Friday night. <grin>

In case you wanted to read it, here's my pontification for class on "How does the structure of a database affect a searcher’s ability to retrieve documents?"

There are two types of structure to consider when discussing databases and the effect of that structure on retrieval ability: technical structure and database structure.

Technical Structure

The ability to retrieve information, and the speed of access, is partially determined by the technical structure of the database that stores the information. Is it using flat files, or is it using a standard enterprise-level database architecture like Oracle, SQL Server or Sybase? If it’s using flat files, this is NOT GOOD, as this structure is at least 20 years out of date and has a whole host of issues relating to file locking, data corruption and slow data access due to the need to search serially through the data using string functions. Flat file “databases” are also incredibly difficult to maintain, since you usually have to make changes manually. Relational databases are a lot better: if you have to delete a record, you can usually set the database up to cascade the change throughout the structure, eliminating the orphan records caused by manual deletes. Please don’t get me started on Excel being used as a “database” either. It’s one step up from flat text files, but it’s NOT a database, not even if you want it to be.
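To make the cascading delete concrete, here is a minimal sketch in Python using the built-in sqlite3 module. SQLite is only a convenient stand-in for the enterprise databases named above, and the author and article tables, their columns and the sample rows are all hypothetical, made up for illustration.

```python
import sqlite3


def demo_cascade():
    """Delete one author and return the article titles that survive."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless you ask for it.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE article ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        # ON DELETE CASCADE removes an author's articles along with the author.
        " author_id INTEGER NOT NULL REFERENCES author(id) ON DELETE CASCADE)"
    )
    conn.execute("INSERT INTO author (id, name) VALUES (1, 'Smith'), (2, 'Jones')")
    conn.execute(
        "INSERT INTO article (title, author_id) VALUES ('A', 1), ('B', 1), ('C', 2)"
    )
    # One delete on the parent table; the database cleans up the child rows.
    conn.execute("DELETE FROM author WHERE id = 1")
    remaining = [row[0] for row in conn.execute("SELECT title FROM article ORDER BY title")]
    conn.close()
    return remaining
```

Calling demo_cascade() leaves only the article that belongs to the remaining author, so no orphan records are left behind. With a flat file you would have to hunt down and remove those child lines yourself.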

The author’s usage of Index File makes me cringe. Index Table would be a better term for the particular structure they are describing, since I should certainly hope that the big commercial databases are running on commercial relational databases instead of old COBOL systems and flat files.


Database Structure

A relational database should be organized into tables using the rules of data normalization. More info on the rules, along with some great pictures, can be found at
http://www.datamodel.org/NormalizationRules.html but here are the basic rules from the www.datamodel.org site:

1.        Eliminate Repeating Groups - Make a separate table for each set of related attributes, and give each table a primary key.
2.        Eliminate Redundant Data - If an attribute depends on only part of a multi-valued key, remove it to a separate table.
3.        Eliminate Columns Not Dependent On Key - If attributes do not contribute to a description of the key, remove them to a separate table.
4.        Isolate Independent Multiple Relationships - No table may contain two or more 1:n or n:m relationships that are not directly related.
5.        Isolate Semantically Related Multiple Relationships - There may be practical constraints on information that justify separating logically related many-to-many relationships.
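As a rough illustration of the first couple of rules, here is a small Python sketch that takes flat records with a repeating group (a semicolon-separated keyword list) and redundant data (the journal name repeated on every row) and splits them into separate keyed tables. The sample records, the field names and the normalize function are all hypothetical; this is one simple way to show the idea, not the method from the datamodel.org site.

```python
# Hypothetical flat records: the journal name repeats on every row and the
# keywords are crammed into one field.
flat_rows = [
    {"title": "Intro to Searching", "journal": "Library Quarterly", "keywords": "search; databases"},
    {"title": "Index Design", "journal": "Library Quarterly", "keywords": "indexing; databases"},
]


def normalize(rows):
    """Split flat rows into journal, keyword, article and link tables."""
    journals = {}          # journal name -> journal id
    keywords = {}          # keyword -> keyword id
    articles = []          # (article id, title, journal id)
    article_keywords = []  # (article id, keyword id) link table
    for article_id, row in enumerate(rows, start=1):
        # Store each journal once and refer to it by key afterwards.
        journal_id = journals.setdefault(row["journal"], len(journals) + 1)
        articles.append((article_id, row["title"], journal_id))
        # Break the repeating group into one link row per keyword.
        for word in row["keywords"].split(";"):
            keyword_id = keywords.setdefault(word.strip(), len(keywords) + 1)
            article_keywords.append((article_id, keyword_id))
    return journals, keywords, articles, article_keywords
```

After normalize(flat_rows), the journal name is stored once, each keyword is stored once, and the articles point at both through key values instead of repeated text.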

So what you end up with is an interlocking web of data, linked by key fields and easily searchable, provided the fields to search on have been included in the database. For a database to be easily searchable, the data and the structure that holds the data need to work hand in hand. If the user wants to search for records using a particular field or descriptor, but the designer left it out of the structure, then the user can’t search the way they wanted. Conversely, it’s possible for the designer to break the data up so much that the user can’t find what they are looking for either, since there are too many places to look. There is a fine line between too many data fields and just the right amount.
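Here is a small sketch of both halves of that point, again in Python with the built-in sqlite3 module. The article, descriptor and article_descriptor tables, the sample rows and the publication_year field are all hypothetical. The first function searches by following the key fields from descriptor to article; the second tries to search on a field the designer never included.

```python
import sqlite3


def build_db():
    """Create a tiny in-memory database linked together by key fields."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE descriptor (id INTEGER PRIMARY KEY, term TEXT NOT NULL UNIQUE);
        CREATE TABLE article_descriptor (
            article_id INTEGER NOT NULL REFERENCES article(id),
            descriptor_id INTEGER NOT NULL REFERENCES descriptor(id),
            PRIMARY KEY (article_id, descriptor_id)
        );
        INSERT INTO article VALUES (1, 'Index Design'), (2, 'Pie Crust Basics');
        INSERT INTO descriptor VALUES (1, 'databases'), (2, 'cooking');
        INSERT INTO article_descriptor VALUES (1, 1), (2, 2);
    """)
    return conn


def titles_for_descriptor(conn, term):
    """Follow the key fields from a descriptor term to the matching titles."""
    rows = conn.execute(
        "SELECT a.title FROM article a"
        " JOIN article_descriptor ad ON ad.article_id = a.id"
        " JOIN descriptor d ON d.id = ad.descriptor_id"
        " WHERE d.term = ? ORDER BY a.title",
        (term,),
    )
    return [row[0] for row in rows]


def can_search_by_year(conn):
    """Try to search on a field that was never put in the structure."""
    try:
        conn.execute("SELECT title FROM article WHERE publication_year = 2006")
        return True
    except sqlite3.OperationalError:
        # The column does not exist, so this search is simply not possible.
        return False
```

With this structure, a descriptor search works through the joins, while the year search fails outright because the field was left out of the design.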

Of course this brings me to the next part of the structure, the data access tool. Most users won’t be hitting the database directly; usually their access is restricted to a tool provided to search the database for them. The tool is typically designed by a team (system analyst and business analyst are typical job titles) whose members interview the various types of users and find out what they expect the system to do for them. Large shelves of books have been written on how to do this, and there are lots of methodologies out there on the best way to get the requirements for the system out of the user, but the basic approach is iterative:
•        Talk to user
•        Design
•        Show design to users
•        Incorporate feedback into design
•        Show again
•        Repeat above steps until you go insane with new requests from users. Just kidding, usually there is a design period and a cutoff date, so from date A to date B it’s in design, and after that the design is frozen and any future requests go in the next version.
•        Build the darn thing
•        Perform a system test to make sure all the bugs are worked out. Also known as, make sure it works the way you think it should and not any other way.
•        Give it to the users to test. As always, the users will find bugs you didn’t think were there, were sure you had fixed, or didn’t even know could happen and are unreproducible. This last one is known as the “Only happens when there's a Blue moon on Friday” kind of bug. If you can reproduce the bug it is fixable, but if you can’t reproduce it, then you really can’t fix it, since you don’t know what caused the darn thing to happen in the first place.
•        Fix bugs, give to users to test, repeat as often as necessary, usually for a specific period of time.
•        Release new program.
•        Fix new bugs which the users so very kindly found for you and start adding in the new features that everybody has now decided they need immediately but they never thought of before this.
