database design suggestion
database design suggestion
(OP)
Hi everyone, What would be the best way of representing this relation between Countries, States and Cities.
1-
tblCountries
countID(PK), countName
tblCountries_States
countID, stateID
tblStates
stateID(PK), stateName
tblStates_Cities
stateID, cityID
tblCities
cityID(PK), cityName
OR is this way better
2-
tblCountries
countID(PK), countName
tblSates
stateID(PK), stateName, countID(FK)
tblCities
cityID(PK), cityName, stateID(FK), countID(FK)
Why would one design be preferred to the other?
Thanks
1-
tblCountries
countID(PK), countName
tblCountries_States
countID, stateID
tblStates
stateID(PK), stateName
tblStates_Cities
stateID, cityID
tblCities
cityID(PK), cityName
OR is this way better
2-
tblCountries
countID(PK), countName
tblSates
stateID(PK), stateName, countID(FK)
tblCities
cityID(PK), cityName, stateID(FK), countID(FK)
Why would one design be preferred to the other?
Thanks
RE: database design suggestion
1 isn't wrong, but it is over-designed, as it would allow a state to belong to more than one country, or a city to belong to more than one state
r937.com | rudy.ca
Buy my new book Simply SQL from Amazon
RE: database design suggestion
RE: database design suggestion
r937.com | rudy.ca
Buy my new book Simply SQL from Amazon
RE: database design suggestion
If somebody asked for all the cities in state a (id 23) in country b (id 76), what does the query look like in layout 1 vs layout 2?
In 2 I think it is
select cityid, cityname
from tblcities
where stateid=23 and countid=76
I would have a difficult time writing the same query in 1.
I think 2 is the easier query for this particular question. It may depend on what type of questions you think you will need to answer.
RE: database design suggestion
if not, then i would completely discard option 1.
r937.com | rudy.ca
Buy my new book Simply SQL from Amazon
RE: database design suggestion
You have hierarchical data, because any one city can only be in one state, which can only be in one country.
Therefore the city table only needs a foreign key to state, thereby automatically making the country information available. This elegantly and correctly handles the two towns called Columbus, the two towns called Washington, one of which is in England, and the other in that strange land 3,500 miles off England's West coast, er, I think it's called America.
I would also recommend losing the "tbl" prefix from the table names, and the needless abbreviations. Everything in a database from which one selects data, can be reasonably assumed to be a table, unless specified otherwise, e.g. V_COUNTRY would be a view about countries.
You also need to constrain your data to be valid, e.g. no null country, state or town names. No leading or trailing white spaces, and initial capitals for all names (because a country is a proper noun).
As an example, try something like
CODE
(
COUNTRY_ID INTEGER,
COUNTRY_NAME VARCHAR2(100)
);
ALTER TABLE COUNTRY ADD CONSTRAINT PK_COUNTRY PRIMARY KEY (COUNTRY_ID);
ALTER TABLE COUNTRY ADD CONSTRAINT NN_COUNTRY_NAME CHECK (COUNTRY_NAME IS NOT NULL);
ALTER TABLE COUNTRY ADD CONSTRAINT UQ_COUNTRY_NAME UNIQUE(COUNTRY_NAME);
ALTER TABLE COUNTRY ADD CONSTRAINT PROPER_NOUN_COUNTRY_NAMES CHECK (COUNTRY_NAME = INITCAP(COUNTRY_NAME));
CREATE TABLE STATE
(
STATE_ID INTEGER,
STATE_NAME VARCHAR2(100),
COUNTRY_ID INTEGER
);
ALTER TABLE STATE ADD CONSTRAINT PK_STATE PRIMARY KEY (STATE_ID);
ALTER TABLE STATE ADD CONSTRAINT NN_STATE_NAME CHECK (STATE_NAME IS NOT NULL);
ALTER TABLE STATE ADD CONSTRAINT PROPER_NOUN_STATE_NAMES CHECK (STATE_NAME = INITCAP(STATE_NAME));
ALTER TABLE STATE ADD CONSTRAINT STATE_MUST_BE_IN_A_COUNTRY CHECK (COUNTRY_ID IS NOT NULL);
ALTER TABLE STATE ADD CONSTRAINT FK_STATE_COUNTRY FOREIGN KEY (COUNTRY_ID)
REFERENCES COUNTRY(COUNTRY_ID);
Resist the temptation to use the field name of "NAME" instead of COUNTRY_NAME or STATE_NAME as the word NAME on its own is a key or reserved word in Oracle (where I did the above example). For this reason I suggest you use a colour context sensitive editor to do your create table statements, so that you automatically are made aware of any such words.
I suspect that you are using Access, in which case this may well be irrelevant, as I believe it provides a designer form for the purpose.
Regards
T
RE: database design suggestion
RE: database design suggestion
I'm developing with PHP and I'll be validating what the user will be entering either client side or server side. I won't let MySQL do the validation.
RE: database design suggestion
after all, your app is not the only way of getting data into the database, and you want to be sure the other routes don't introduce inconsistencies
r937.com | rudy.ca
Buy my new book Simply SQL from Amazon
RE: database design suggestion
if you don't constrain the data to be valid in the database, I hereby guarantee you that you will get corrupt and/or invalid data into your system.
On what basis of reason and logic can you guarantee that the application will the the only thing ever to access the data? Nobody will ever start a query editor, link to the db and run SQL? No technicians will ever go in directly to perform routine maintenance? No changes will ever take place requiring new fields, and data synthesis?
You MUST constrain the data. If the application does it too, that's fine.
Regards
T
RE: database design suggestion
RE: database design suggestion
r937.com | rudy.ca
Buy my new book Simply SQL from Amazon
RE: database design suggestion
The usual twaddle spouted is that the application should be "database independent" and therefore it must do all its own checking, so that it will work on oracle, sql server, mysql or any other RDBMS.
Also, the user experience of manually entering 20 fields on a form, only to be told at the end that there's something wrong, is very poor. It also consumes network resources to do round trips to the db, every time a field is populated. In this case, the front end form should do validation as-you-go, and then the db checking provides belt and braces. This only applies where there is manual data entry. For everything else, there is no user-editable form to worry about, so the db does the validation.
If the db does checking, and the form writer goofs, no problem, but the reverse is not true. The db must always be the final arbiter of truth and enforcer ofdata integrity.
Also, constrained fields can (in oracle at least) be detected by the optimiser, and where appropriate optimised out of queries, thereby improving performance.
This is trivial however, the main thing is to always do integrity and validation in the db, also do it in forms that users use to make edits, and nowhere else in the front end.
Regards
T
RE: database design suggestion
But it's not only an unneccessary field, it's even dangerous, as it can introduce inconsistencies like a city belonging to a state, but also to a country, to which that state does not belong to.
The query BigRed wrote on the model2 would make it easy to query data without joining other tables, but the data integrity is worth more than ease or performance of queries.
You would rather do
CODE
from tblcities
left join tblstates on tblstates.stateid = tblcities.stateid
where tblcities.stateid=23 and tblstates.countid=76
And even though filtering for country 76 would be unneccessary, if state 23 is part of that country, it would double check that and only yield results, if that is the case.
Another thought I'd add is, that you should not abbreviate country with count, the field name suggests it's some kind of count, even though the suffix id suggests it's some key field, it could be a key to some count table.
Like with any code or variables, rather be verbose. It's intellisense, that should shorten your writing, not abbreviations. you want to be able to get back to code written years ago by yourself or - even worse - by others and be able to understand it.
Bye, Olaf.
RE: database design suggestion
City -> Columbus
State -> Ohio
Country -> USA
City -> Columbus
State -> Wisconsin
Country -> USA
City -> Ottawa
State -> Ontario
Country -> Canada
City -> Ottawa
State -> Illinois
Country -> USA
City -> Madrid
State -> N/A
Country -> Spain
City -> Tokyo
State -> N/A
Country -> China
etc...
The state is not always required. It is when it makes sense to North American clients (states and provinces). A client could choose to go to England. What will be important is the country and the city. Another client could choose to go to Canada and then Ontario and then choose a city in Ontario.
This project is about choosing a college institution by country, by states (if available) or by city. Then the table institutions would be something like this :
tblInstitutions (I've added the tbl to make a distinction with a field)
institutionName
cityID (FK)
stateID(FK) (could be a N/A field)
countryID(FK)
And the other tables (countries, states, cities)
tblCountries
countID(PK), countName
tblSates
stateID(PK), stateName, countID(FK)
tblCities
cityID(PK), cityName, stateID(FK), countID(FK)
Considering this extra inforamtion, is this the best way of representing the relations between the tables?
RE: database design suggestion
State (for USA),
province for Canada (I believe),
territory (for antipodeans)
County (for the UK),
Canton for Switzerland,
Departement for France
etc.
You should also have a 'not applicable' entry, e.g. for Liechtenstein and Luxemboug.
Your design needs a country_region table as each region belongs to 1 and only one country. Each city may belong to zero or 1 geographic regions, but by FK to the region, one can obtain the country. Luxembourg for example would have a region type of "Not applicable" but there would still be an entry in the geographic region table for luxembourg, which would just be a place holder for the foreign key to country.
Regards
T
RE: database design suggestion
RE: database design suggestion
In regard to the problem of countries with no state I'd also recommend a placeholder like "not applicable". In a strict hierarchical structure of countries->states->cities you can't have a NULL for the stateID in the cities table, as there is no countryID in the cities table, therefore you'll have that single replacement for the missing states to be able to connect to the country.
It's not a good idea to let cities have a countryID for the countries in which there are no states just to spare one record. This will introduce the possibility of redundancy and contradictionary relations, which a database model should prevent by it's design alone, not with additional rules like you could set and apply: Only one of the ID fields state or country of a cities table having both foreign keys should be set to a valid ID and the other remain NULL.
That's a way to save one record for countries with no states or similar hierarchical entities. For the cost of rather complex SQL you'd later need to query data.
Bye, Olaf.
RE: database design suggestion
Beacause they were written by application programmers and not database specialists.
But trust us, any database specialist will tell you of the problems caused (that are VERY hard to fix) by this poor practice of enforcing things only in the application. The database will probably outlive the application by many years, this stuff must be enforced in the database.
"NOTHING is more important in a database than integrity." ESquared
RE: database design suggestion
-------------------------
The trouble with doing something right the first time is that nobody appreciates how difficult it was - Steven Wright