Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

SQL Puzzle 14: Puzzle Lite #2 on 'roids 5

Status
Not open for further replies.

vongrunt

Programmer
Mar 8, 2004
4,863
HR
Take data from SQL Puzzle Lite #2 (see thread183-1197121).

Make result set containing 48 rows and three columns: N (tinyint), State and Capital

1st row (N=2): state/capital closest to two other capitals
2nd row (N=3): state/capital closest to three other capitals
...
repeat until N=49

Criteria "closest" is based on sum of distances. For N=2 example, every city has two nearest cities - but (only) one has minimal sum of distances to nearest two cities and that is... Boston, I think.

Include sum in result set if you want.

Best exec time (measured w/ profiler) wins it all.

Plz no cheating - especially about Hawaii :p

------
Theory: everybody knows everything, nothing works
Practice: everything works, nobody knows why

[banghead]
 
These are the results of the belgian jury.

9 seconds on my system. A numbers table was allowed?

Code:
2	Boston, Massachusetts, 107.472
3	Boston, Massachusetts, 201.493
4	Providence, Rhode Island, 347.197
5	Concord, New Hampshire, 478.224
6	Concord, New Hampshire, 608.66
7	Concord, New Hampshire, 870.066
8	Hartford, Connecticut, 1141.74
9	Hartford, Connecticut, 1384.26
10	Hartford, Connecticut, 1668.28
11	Hartford, Connecticut, 2058.31
12	Hartford, Connecticut, 2581.87
13	Trenton, New Jersey, 3035.65
14	Trenton, New Jersey, 3468.66
15	Trenton, New Jersey, 3994.52
16	Trenton, New Jersey, 4551.26
17	Harrisburg, Pennsylvania, 5050.77
18	Harrisburg, Pennsylvania, 5542.58
19	Harrisburg, Pennsylvania, 6135.63
20	Harrisburg, Pennsylvania, 6743.95
21	Harrisburg, Pennsylvania, 7418.72
22	Harrisburg, Pennsylvania, 8095.69
23	Harrisburg, Pennsylvania, 8851.93
24	Harrisburg, Pennsylvania, 9647.76
25	Charleston, West Virginia, 10348.3
26	Charleston, West Virginia, 10991
27	Charleston, West Virginia, 11665.8
28	Charleston, West Virginia, 12406.5
29	Charleston, West Virginia, 13155.7
30	Charleston, West Virginia, 13922.7
31	Charleston, West Virginia, 14694.1
32	Charleston, West Virginia, 15523.8
33	Columbus, Ohio, 16421.9
34	Columbus, Ohio, 17351.8
35	Frankfort, Kentucky, 18302
36	Frankfort, Kentucky, 19301.3
37	Frankfort, Kentucky, 20369.3
38	Frankfort, Kentucky, 21449.1
39	Frankfort, Kentucky, 22626.6
40	Indianapolis, Indiana, 23990.1
41	Indianapolis, Indiana, 25388
42	Indianapolis, Indiana, 26900.6
43	Indianapolis, Indiana, 28479.8
44	Indianapolis, Indiana, 30274.4
45	Indianapolis, Indiana, 32178.9
46	Springfield, Illinois, 33918
47	Springfield, Illinois, 35661.6
48	Springfield, Illinois, 38065.9
49	Springfield, Illinois, 42763.9

Code:
[COLOR=white]
CREATE TABLE #number (number int PRIMARY KEY)
DECLARE @intLoopCounter INT
SELECT @intLoopCounter =2
WHILE @intLoopCounter <=49 BEGIN
INSERT INTO #Number
VALUES (@intLoopCounter)
SELECT @intLoopCounter = @intLoopCounter +1
END
GO 

SELECT 	s1.Capital
	, s1.State 
	, dbo.fnCalculateDistance(s1.longitude,s1.latitude,s2.longitude,s2.latitude) AS Distance
INTO #temp
FROM 	StateCapitals s1, Statecapitals s2
WHERE 	dbo.fnCalculateDistance(s1.longitude,s1.latitude,s2.longitude,s2.latitude) > 0
GO

SELECT 	number
	, capital 
		= (	SELECT TOP 1 a1.capital + ', ' + a1.state + ', ' + convert(varchar(20),sum(distance)) sumdistance
			FROM #temp as a1 
WHERE (SELECT count(*) FROM #temp a2 
       WHERE a2.distance <= a1.distance and 
      a1.capital = a2.capital) <= n1.number
GROUP BY a1.capital, a1.state
ORDER by sum(distance))
FROM #number n1
GO

DROP TABLE #temp
DROP TABLE #number
GO
[/color white]

Christiaan Baes
Belgium

"My new site" - Me
 
Results are OK... IMHO.

Exec time (AMD64/3200): 4.030, 4.110, 4.033

------
Theory: everybody knows everything, nothing works
Practice: everything works, nobody knows why

[banghead]
 
Anybody else?

------
Theory: everybody knows everything, nothing works
Practice: everything works, nobody knows why

[banghead]
 
I will try, but I wont have chance before the weekend. I doubt that any solution I come up with will compare speed-wise, it will almost certainly be procedural.

[vampire][bat]
 
> I doubt that any solution I come up with will compare speed-wise, it will almost certainly be procedural.

FYI there is no need to go procedural (hints: ranking, running total).

And plz - separated columns for State/Capital and eventually sumDistance, as explained in OP.

------
Theory: everybody knows everything, nothing works
Practice: everything works, nobody knows why

[banghead]
 
> So mine is not a good solution?

I won't nitpick that much... every time new puzzle comes out we always find 2-3 shortcuts thru hyperspace [smile].

Your code gives good results. Let's see will someone come up with something faster.

------
Theory: everybody knows everything, nothing works
Practice: everything works, nobody knows why

[banghead]
 
I can't see a non-procedural solution, so I'll sit this one out and hopefully learn from the results.

[vampire][bat]
 
earthandfire, I urge you to do this.

I did this in a quasi procedural way and got very fast results. In fact, my output is identical to chrissie's, but on my computer, only takes 0.4 seconds [small](which probably means an hour or 2 on vongrunts)[/small]. I'm going to 'work it a little' because I can probably shave a little more time off.

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
If you are taking a procedural approach, George, then I will definitely have a go. Probably not before weekend though.

I'm so used to procedural programming, SQL is just alien to me and I'm finding it an uphill struggle - challenging, but interesting - which is why I think these puzzles are great.

[vampire][bat]
 
I shaved 25% off the execution time. I'm satisfied. Time for bed. As soon as Vongrunt decides when we should post results, I will. I'm getting consistent results at 0.3 seconds. Intel Pentium 4 Processor With Hyperthreading Technology 2.8 GHz, 1 GB Ram.


-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
> I'm so used to procedural programming, SQL is just alien to me and I'm finding it an uphill struggle - challenging, but interesting

And there are days when I do procedural programming and forget If/For statements ever exist :)

True, SQL demands more "conquer&divide" approach - how to make as much as possible with fewer statements, then transparently take care about exceptions.


> I'm getting consistent results at 0.3 seconds.

Very good IMHO. Post it.

------
Theory: everybody knows everything, nothing works
Practice: everything works, nobody knows why

[banghead]
 
Here ya go...

Code:
Alter Procedure Puzzle14OnRoids_gmmastros
AS
SET NOCOUNT ON

Declare @Data 
Table   (
        Capital VarChar(20) Primary Key Clustered, 
        Longitude Decimal(8,5), 
        Latitude Decimal(8,5)
        )

Insert Into @Data Select Capital, Longitude, Latitude From StateCapitals

Declare @Temp
Table   (
        RowId Integer Identity(1,1),
        FromCapital VarChar(100),
        ToCapital VarChar(100),
        Distance Float,
        Rank Integer
        Primary Key Clustered (FromCapital, ToCapital)
        )

Insert
Into    @Temp
        (
        FromCapital,
        ToCapital,
        Distance
        )
Select  A.Capital,
        B.Capital,
        dbo.FnCalculateDistance(A.Longitude, A.Latitude, B.Longitude, B.Latitude) As Distance
From    @Data A 
        Inner Join @Data B 
           On A.Capital <> B.Capital
Order By A.Capital, 
        dbo.FnCalculateDistance(A.Longitude, A.Latitude, B.Longitude, B.Latitude) 

Update  T
Set     T.Rank = T.RowId - A.MinRowId + 1
From    @Temp T
        Inner Join 
        (
        Select FromCapital,
               Min(RowId) As MinRowId
        From   @Temp
        Group By FromCapital
        ) A On T.FromCapital = A.FromCapital

Declare @Out
Table   (
        Rank Integer,
        FromCapital VarChar(20),
        Distance Float
        )

Declare @i Integer
Set @i = 2

While @i <= 49
    Begin
        Insert
        Into   @Out
               (
               Rank,
               FromCapital,
               Distance
               )
        Select Top 1
               @i,
               FromCapital,
               Sum(Distance)
        From   @Temp
        Where  Rank <= @i
        Group By FromCapital
        Order By Sum(Distance)

        Set @i = @i + 1
    End

Select Rank, 
       State, 
       FromCapital As Capital, 
       Convert(Decimal(10,3), Distance) As Distance
From   @Out O
       Inner join StateCapitals
          On O.FromCapital = StateCapitals.Capital
Order By O.Rank

go

Declare @S DateTime
Set @S = GetDate()

exec Puzzle14OnRoids_gmmastros
Select DateDiff(Millisecond, @s, GetDate())

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
Exec times for yer sproc:

Xeon 3.06: 0,450, 0,382, 0,436
AMD64/3200: 1.210, 0.236, 0.236, 1.230, 0.230

Weird thing about later results: same happens to my code - and only on that machine.

Anyway, here is my weirdo. 2nd query is kinda interesting:
Code:
create procedure Puzzle14OnRoids_vongrunt
as
set nocount on

-- calculate all distances, perform ordered insert
select identity(int, 1, 1) as foo, 	convert(tinyint, null) as subRank, State, Capital, Distance
into #blah
from
(	select top 100 percent
		SC.State, SC.Capital,
		dbo.fnCalculateDistance( SC.Longitude, SC.Latitude, SC2.Longitude, SC2.Latitude ) as Distance
	from StateCapitals SC 
	inner join StateCapitals SC2 on SC.State <> SC2.state
	order by SC.State, Distance
) AllDistances

-- get ranks and running total values per (State) partition sorted by (Distance)
declare @runningSum float; set @runningSum = 0
declare @State varchar(20)
update #blah
	set @runningSum = Distance = @runningSum + case when State <> @State then Distance-@runningSum else Distance end, 
	@State = State,
	subrank = 1 + (foo-1)%49

-- find min. values per rank, join to get State/Capital
select A.subRank as N, A.Capital+', '+State as StateCapital, convert(decimal(7, 2), A.Distance) as minSumDistance
from #blah A
inner join
(	select subRank, min(Distance) as minSumDistance
	from #blah
	where subRank between 2 and 49
	group by subRank
) B on A.subRank=B.subRank and A.Distance = B.minSumDistance

drop table #blah

go

Declare @S DateTime
Set @S = GetDate()

exec Puzzle14OnRoids_vongrunt
Select DateDiff(Millisecond, @s, GetDate())

------
Theory: everybody knows everything, nothing works
Practice: everything works, nobody knows why

[banghead]
 
vongrunt,

Your sp is consistently running at 0.17 seconds on my computer. Nice job. That's about 1/2 the time of my procedure.

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
Weird. Somebody told me that variable tables should be faster then temp tables. So I changed mine and guess what it took 2m41 to complete

Code:
SET NOCOUNT ON

DECLARE @number TABLE (number int PRIMARY KEY)
DECLARE @intLoopCounter INT
SELECT @intLoopCounter =2
WHILE @intLoopCounter <=49 BEGIN
	INSERT INTO @Number
	VALUES (@intLoopCounter)
	SELECT @intLoopCounter = @intLoopCounter +1
END

Declare @temp
Table   (
        Capital VarChar(20) 
        , state varchar(20) 
        , distance float
        )

INSERT INTO @temp (capital,state,distance)
(	SELECT 	s1.Capital
		, s1.State 
		, dbo.fnCalculateDistance(s1.longitude,s1.latitude,s2.longitude,s2.latitude) AS Distance
	FROM 	StateCapitals s1, Statecapitals s2
	WHERE 	dbo.fnCalculateDistance(s1.longitude,s1.latitude,s2.longitude,s2.latitude) > 0
)
select * from @temp
select * from @number

SELECT 	number
	, capital 
		= (	SELECT TOP 1 a1.capital + ', ' + a1.state + ', ' + convert(varchar(20),sum(distance)) sumdistance
			FROM @temp as a1 
WHERE (SELECT count(*) FROM @temp a2 
       WHERE a2.distance <= a1.distance and 
      a1.capital = a2.capital) <= n1.number
GROUP BY a1.capital, a1.state
ORDER by sum(distance))
FROM @number n1
GO

SET NOCOUNT OFF

Christiaan Baes
Belgium

"My new site" - Me
 
> Your sp is consistently running at 0.17 seconds on my computer.

I'm not sure that would be possible without dirty pass-through-assignment hack used in 2nd query (it relies on internal sort of data, but does everything in one pass - linear time).

> Weird. Somebody told me that variable tables should be faster then temp tables.

Common urban myth - RAM is faster than disk and blah. Operations that require frequent "rewinds" (a la self/theta joins, TOP N subqueries etc) are often much slower with table @variables.

------
Theory: everybody knows everything, nothing works
Practice: everything works, nobody knows why

[banghead]
 
Chrissie,

Don't give up on table variables. Like vongrunt says, they are not always faster. In my experience, they are usually faster, but not always.

I was playing with your code a little, and found something interesting...

Replace:
[tt]Declare @temp
Table (
Capital VarChar(20)
, state varchar(20)
, distance float
)[/tt]

With:

Code:
Declare @temp
Table   (
        Capital VarChar(20) 
        , state varchar(20) 
        , distance float
        [!]Primary Key Clustered (Capital, State, Distance)[/!]
        )

Compare the execution time. I think you'll be impressed.

-George

Strong and bitter words indicate a weak cause. - Fortune cookie wisdom
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top