A good Friday
Jan. 16th, 2004 05:08 pmToday was a very good Friday, got everything all wrapped up for the week, homework done and two pies in the oven by 4:30. Tomorrow Matt has to work, so I am off to a Culinary guild meeting at 1 until whenever, so it should hopefully be a good day of chatting with fun folks and cooking. Tonight the plan is to stay in, eat fried chicken and mashed potatoes, cherry pie for dessert and play dominos. Just a nice quiet Friday night. <grin>
In case you wanted to read it, here's my pontification for class on "How does the structure of a database affect a searcher’s ability to retrieve documents?"
There are two types of structure to consider when discussing databases and the effects of said structure on retrieval ability, technical structure and database structure.
Technical Structure
The ability to retrieve information and the speed of access is partially determined by the technical structure of the database that is storing the information. Is it using flat files or is it using a standard enterprise level database architecture like Oracle, SQL Server or Sybase? If its using flat files, this is NOT GOOD, as this structure is at least 20 years out of date and has a whole host of issues relating to file locking, data corruption and slow data access due to the need to serially search using string functions through the data. It also is incredibly difficult to maintain flat file “databases” since usually you have to manually make the changes, relational databases are a lot better since if you have to delete a record you can usually set it up to cascade the changes throughout the structure, eliminating the orphan records caused by manual deletes. Please don’t get me started on Excel being used as a “database” either, its one step up from flat text files, but its NOT a database, not even if you want it to be.
The author’s usage of Index File makes me cringe, Index Table would be a better word for the particular structure that they are using, since I should certainly hope that the big commercial databases are using commercial relational databases instead of old COBOL systems and flat files.
Database Structure
A relational database should be organized into tables using the rules of data normalization, more info on the rules along with some great pictures, can be found here, http://www.datamodel.org/NormalizationRules.html but here are the basic rules from the www.datamodel.org site
1. Eliminate Repeating Groups - Make a separate table for each set of related attributes, and give each table a primary key.
2. Eliminate Redundant Data - If an attribute depends on only part of a multi-valued key, remove it to a separate table.
3. Eliminate Columns Not Dependent On Key - If attributes do not contribute to a description of the key, remove them to a separate table.
4. Isolate Independent Multiple Relationships - No table may contain two or more 1:n or n:m relationships that are not directly related.
5. Isolate Semantically Related Multiple Relationships - There may be practical constrains on information that justify separating logically related many-to-many relationships.
So what you end up with is a interlocking web of data, linked by key fields and easily searchable, if the fields to search on have been included in the database. For an easily searchable database to happen you need the data and the structure that holds the data to work hand in hand. If the user wants to search for records using a particular field or descriptor, but the designer left it out of the structure, then the user can’t use it the way they wanted. Conversely, its possible for the designer to break the data up so much that the user can’t find what they are looking for either since there are too many places to look. There is a fine line between too many data fields and just the right amount.
Of course this brings me to the next part of the structure, the data access tool, because most users won’t be hitting the database directly, usually their access is restricted to using a tool provided to search the database for them. This is typically designed by a team, a system analyst or a business analyst are typical job titles, who interview the various types of users and find out what they expect the system to do for them. Large shelves of books have been written on this topic of how to do this and there are lots of methodologies out there on the best way to get the requirements for the system out of the user, but the basic approach is iterative.
• Talk to user
• Design
• Show design to users
• Incorporate feedback into design
• Show again
• Repeat above steps until you go insane with new requests from users. Just kidding, usually there is a design period and a cut off date, so from date A to date B its in design and then after that the design is frozen and any future requests go in the next version.
• Build the darn thing
• Perform a system test to make sure all the bugs are worked out. Also known as, make sure it works the way you think it should and not any other way
• Give it to the users to test. As always, the users will find bugs you didn’t think were there, were sure you had fixed or didn’t even know could happen and are un-reproducable, this last one is known as the “Only happens when there's a Blue moon on Friday” kind of bug. If you can reproduce the bug it is fixable, but if you can’t reproduce it, then you really can’t fix it since you don’t know what caused the darn thing to happen in the first place.
• Fix bugs, give to users to test, repeat as often as necessary, usually for a specific period of time.
• Release new program.
• Fix new bugs which the users so very kindly found for you and start adding in the new features that everybody has now decided they need immediately but they never thought of before this.
(no subject)
Date: 2004-01-16 06:07 pm (UTC)Are you going to Ursulmas next week?
(no subject)
Date: 2004-01-16 08:56 pm (UTC)(no subject)
Date: 2004-01-16 11:13 pm (UTC)(no subject)
Date: 2004-01-17 09:21 am (UTC)(no subject)
Date: 2004-01-17 05:39 pm (UTC)Now if only I could talk Jason into going...
(no subject)
Date: 2004-01-16 08:13 pm (UTC)You are one of those data base people who never give me a straight answer at work! Okay, so they are *not all* that bad, but some of them have no clue how to speak english. You'd think being an engineer would help, but it doesn't.
(no subject)
Date: 2004-01-17 09:08 am (UTC)(no subject)
Date: 2004-01-17 10:13 am (UTC)I've found I've got enough of a programing background to make sense on certain things. Like I discovered a flaw the other day in a report a DBA would generate for me, so I wrote a new script to remedy it, which he then translated into the language he needed. But the innerworkings and such of DB still baffle me at times.
(no subject)
Date: 2004-01-16 08:58 pm (UTC)(no subject)
Date: 2004-01-17 03:29 am (UTC)first the data,
then the search criteria (ways you will use the data),
which creates the navigation,
which controls the design of the interface.
If you have to change the data, or the search criteria that will change the gui (graphic user interface - for those reading this who are not geeks)
"making it pretty", adding graphics - comes last. And that's where a lot of companies/applications fall down, as they go for the look and feel first and then bend the data to fit it. If you want a case study to prove that point Scient.com, Boo.com and Google to prove the data comes first rule. Goggle aint pretty, but is now king because the users can find what they need within the two click rule.
:)
(no subject)
Date: 2004-01-17 09:18 am (UTC)Right, those are the rules for the system design. This was getting to around 2+ pages at the end I decided that it was probably time to wrap it up, so I didn't go into the whole detail of how the system was designed from scratch. I'm in this class with a whole bunch of English majors and other Humanities majors who have never seen the insides of a system before and don't know how they work.
(no subject)
Date: 2004-01-17 10:04 am (UTC)sweetie it's what I do (Project manager/Tech-Lead Developer, now with my own company. We specialise in fixing and rebuilding failed MS sytems to linux.) - for the big boys. Most and i do mean most servers are linux - world wide. "established commecial circles" I'd go back and recheck my facts. I make the backends for these people and they are running away from microsoft in droves. One of the largest, if not the largest databases serving live to the web on the planet is Linux/SQL. IBM are now equally pushing it's ready off the shelf linux computers equally with MS.
And in my experince(since 96) those who opt for the very expensive Microsoft .NET route - have a life span of three years (most 1-2 as long as the VC lasts). Tops! A compnay may have an ASP/VB system in it's offices, but the main power willbe deilvered via linux/unix with SQL/MySQL. Or it falls over. Alot!
If you look at my friends list, they are some of the biggest heavy hitter Developers and they are ALL Unix/Linux. Their employers are, the big telephone companies, oil conglomerates, banks, human genome project, largest ISPs, international ecology groups... it goes on allllll linux/unix.
(no subject)
Date: 2004-01-17 10:09 am (UTC)(no subject)
Date: 2004-01-17 11:42 am (UTC)OK, so I will be the first to admit that I know hardly anything about Linux, since the idea of customizing ones OS scares me stupid, I prefer nice already set up ones that you don't have to customize and potentially screw up the set up on and that come with good GUI's, no command lines and standard applications so you don't have conversion issues if you try to send someone a file or read a file someone sent you.
I really don't mind paying for software that works and if it breaks I can get it fixed easily. My younger brother is the Linux-head of the family and his computer is constantly having issues.
I knew that this is what you do, I was speaking from my own experience and from the industries that I am familiar with, which is banking and insurance here in the US, which isn't into taking big risks. For heavens sake, most of them are still running on big mainframes and just moved to PC's about 9 years ago.
Is SQL MySQL's bigger brother or is it standing for SQL Server?
(no subject)
Date: 2004-01-18 01:40 am (UTC)First there was SeQuil, which was huge and ran like a pig (not babe), it's complicatedness and extensiveness (a function for every function), was then trimmed down and re-enginered as two different languages, SQL and Squirrel. Both still ran like pigs, but more like babe. But, lo unto the humble geek there was a cry! I want an opensource FREE (without the chains of licences or the insistance on software to write my queires!) language, that is tailorable to what I want it to do without having to have all the extensiveness! And thusly was the greased piggy of MySQL delivered unto the greatful developer!
I have worked in banks where we used C/C++ on unix that fell over a lot, then moved to MySQL/Perl/linux and was exceptionally happy. Partially it is because Europe is moving faster to have online access and have had to migrate everything to keep up with each other. Barclays origionally used total MS/Oracle in their frist gen online total system (banking by phone in 2000) with a heavy reliance on the MS end and it was a disater. They then did a half and half and solved their security problems, but it was very embarassing there for a while and near killed the bank.
If you want a real view of what most comapanies (these will be medium to large enterprises as buying your own server and rack are not necessisarly cost effective if you require 24hr support) are using go look at the professional hosting companies. Very few are now offering ASP/VB/SQL as a db option and all of the MS options are twice to 100 times as expensive as PHP/Perl/MySQL. Most now only offer ASP with MySQL. the reasons being (in no particular order)
1. the license fees compaines no longer want to pay
2. the security issuse associated with running MS - so easy to have a denial of service and to be hacked!
3. the support required by the hosting company to constant fix, patch and backup MS - not cost effective
4. and for those who don't actually require the space of their own rack, MySQL is fine.
look at:
www.dsvr.co.uk
http://www.redstation.com/