INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Log In

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Jobs

Update column based on sum value of another column

Update column based on sum value of another column

(OP)
Hello. This is a little difficult to explain....

I have a table with the following table:


CODE

create table mytable1 (counter int, pages int, Folder varchar(100))
insert into mytable1 (counter, pages) values (1, 12)
insert into mytable1 (counter, pages) values (2, 13)
insert into mytable1 (counter, pages) values (3, 6)
insert into mytable1 (counter, pages) values (4, 2)
insert into mytable1 (counter, pages) values (5, 1)
insert into mytable1 (counter, pages) values (6, 9)
insert into mytable1 (counter, pages) values (7, 17)
insert into mytable1 (counter, pages) values (8, 10)
insert into mytable1 (counter, pages) values (9, 11)
insert into mytable1 (counter, pages) values (10, 1) 


I am trying to place no more than 30 units into a folder name.

I would like to take the 'pages' value for row1 and if it is less than 30, update a folder name.

Then I would like to add in row2 pages and if the total for row 1 and 2 are less than 30 then update to the same folder name.

Then I would like to add in row3 pages. If the pages total for rows 1,2,3 exceed 30 then I need to start a new folder name for row 3.


I'm hoping to end up with something like this:

Counter    pages     Folder
1      12      Folder1
2     13     Folder1
3     6     Folder2     (This is a next folder because it would take my sum total for Folder1 over 30)
4     2     Folder2
5     1     Folder2
6     9     Folder2
7     17     Folder3     (This is the next folder because it would take my sum total for Folder2 over 30)
8     10     Folder3
9     11     Folder4     (This is the next folder because it would take my sum total for Folder3 over 30)
10     1     Folder4


I am trying to do this with some while loops, but my I'm putting While scripts within while scripts and I am not getting it correctly.

Any ideas?

RE: Update column based on sum value of another column

Bit more info - depending on the real data and on what you are trying to achieve the best solution may not be the same.

how many rows will you be processing?
Is this a once off or will it be ongoing?
will this be a full process or will it be a delta from previous run?
What are the real rules for ordering the buckets? I know you are using the counter on your example but this may not be the real rule on live data.
is the ordering field always incremented by 1 or will it be random increments?

And what version of SQL Server does this need to run on - depending on it solution may also be different as 2014/2016 have new functionality that may eventually be used.

It would help if you could give us real data and describe the whole process you are doing and what is the ultimate goal of this exercise as we may be able to give a better solution or even suggest alternative ways.

Just as an example that will not work if the input rows are more than 32k

CODE

;with src (counter, pages, sum_pages, group_set) as 
(select top 1  m.counter
              ,m.pages
              ,m.pages as sum_pages
              ,cast(1 as int) as group_set
 from mytable1 m
 union all
 select  m.counter
        ,m.pages
        ,case
         when s1.sum_pages + m.pages > 30 
            then m.pages
         else s1.sum_pages + m.pages
         end as sum_pages
        ,case
         when s1.sum_pages + m.pages > 30 
            then s1.group_set + 1
         else s1.group_set
         end as group_set
 from mytable1 m
 inner join src s1
 on s1.counter + 1 = m.counter
)
select  counter
       ,pages
       ,'Folder' + convert(varchar(20), group_set)
from src
option (maxrecursion 32767) 

Regards

Frederico Fonseca
SysSoft Integrated Ltd
www.syssoft-int.com

FAQ219-2884: How Do I Get Great Answers To my Tek-Tips Questions?
FAQ181-2886: How can I maximize my chances of getting an answer?

RE: Update column based on sum value of another column

(OP)
Hello and thanks for your quick reply and help.

SQL 2008 R2

I will have about 200,000 rows to process. The counter column represents a document number in our database and the pages = the number of pages for that specific document.

The pages sum will = about 1.2 million for all 200,000 rows.

I have a conversion project where I have to place the documents (in counter order) in folders containing 7500-8000 pages. I can't copy a blanket 7500 pages into each folder because most likely the 7500th page will be the middle of a document that will be cut off. I need to have complete documents to end each folder.

So my thought was to sum the pages one row at a time (in order) until I hit 7500, assign a folder, and then start over for the next rows.

The folder names just need to go in order. I could even just assign a value 1,2,3,4,5 and then update them later to an actual 'Folder' + convert(varchar (20), folder) or something like that.

I was going to use the final results to create a quick batch file to run the full copy overnight.

I know there are better programming ways to go about this, but SQL is all I know... And I'm hoping to find a way to not copy/paste 1.2 million images in groups of 7,500.


Live data looks like this:

CODE

create table RecordedDocs (counter int identity(1,1), imageid int, sdocumentno varchar(50), Folder varchar(50), pagenumto int)

Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (8312,'0000-000000',1)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (8313,'1882-01274',8)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (8314,'1972-01712',1)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (8315,'1976-00013',1)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (8316,'1976-00033',12)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (8317,'1976-00047',2)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (8318,'1976-00054',19)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (5794,'1976-00065',1)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (5795,'1976-00066',2)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (5796,'1976-00067',1)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (5797,'1976-00068',1)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (5798,'1976-00069',21)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (5799,'1976-00070',1)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (5800,'1976-00085',4)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (5801,'1976-00103',1)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (5802,'1976-00104',3)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (5803,'1976-00105',7)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (8319,'1976-00148',1)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (8320,'1976-00165',1)
Insert into recordeddocs (imageid, sdocumentno, pagenumto) values (8321,'1976-00166',16) 

Thanks again for your time.

RE: Update column based on sum value of another column

(OP)
This is what I have so far. You can probably imagine how slow this is going.... but it works! Hoping to find a faster way... I am inserting the data one row at a time into another table...

I added a 'Process' column to my original RecordedDocs table.


CODE

create table Tempcount (imageid int, pagenumto int, folder int)



while (select count(counter) from RecordedDocs where process is null) > 0
begin
declare @mincounter int
set @mincounter=(select min(counter) from recordeddocs where process is null)

insert into Tempcount (imageid, pagenumto) select imageid, pagenumto from RecordedDocs where counter = @mincounter

declare @folder int
set @folder = (select isnull(max(folder), 0) from #tempcount)

if (select sum(pagenumto) from tempcount where folder is null) > 7900
update tempcount set folder = @folder + 1 where folder is null

update RecordedDocs set process = 'x' where counter = @mincounter
end 

RE: Update column based on sum value of another column

try the following.
change the 5000 value to your desired value.

Assumption made that sdocumentno is never null - if it is then further coding will be required

CODE

declare @runningtotal int = 0
declare @runningtotal_ver int = 0
declare @groupno int = 1
declare @cutoff int = 5000

if object_id('tempdb..#RecordedDocsSum') is not null
    drop table #RecordedDocsSum;

select min(rd.counter) as counter
      ,rd.sdocumentno
      ,sum(rd.pagenumto) as pagenumto
      , cast(null as int) as runningtotal
      , cast(null as int) as groupno
    into #RecordedDocsSum
from RecordedDocs rd
group by rd.sdocumentno

create clustered index #RecordedDocsSum_ix1 on #RecordedDocsSum
(counter
, sdocumentno
)


update recs
    set @runningtotal = runningtotal = case
        when @runningtotal_ver + recs.pagenumto > @cutoff
            then recs.pagenumto
        else @runningtotal + recs.pagenumto
        end
       ,@groupno = groupno = case
        when @runningtotal_ver + recs.pagenumto > @cutoff
            then @groupno + 1
        else @groupno
        end
       ,@runningtotal_ver = @runningtotal
from #RecordedDocsSum recs
option (maxdop 1)

update recs
    set Folder = 'Folder' + convert(varchar(20), rds.groupno)
from RecordedDocs recs
inner join #RecordedDocsSum rds
on recs.sdocumentno = rds.sdocumentno

-- visually verify that the number of records is not higher than the max
select top 5 rd.Folder
      ,sum(rd.pagenumto) numpages
from RecordedDocs rd
group by rd.Folder
order by numpages desc 

Regards

Frederico Fonseca
SysSoft Integrated Ltd
www.syssoft-int.com

FAQ219-2884: How Do I Get Great Answers To my Tek-Tips Questions?
FAQ181-2886: How can I maximize my chances of getting an answer?

RE: Update column based on sum value of another column

(OP)
Wow that is fantastic! I can't believe how fast it ran. I was able to follow that logic nicely and really appreciate the hours you saved me.

Red Flag This Post

Please let us know here why this post is inappropriate. Reasons such as off-topic, duplicates, flames, illegal, vulgar, or students posting their homework.

Red Flag Submitted

Thank you for helping keep Tek-Tips Forums free from inappropriate posts.
The Tek-Tips staff will check this out and take appropriate action.

Reply To This Thread

Posting in the Tek-Tips forums is a member-only feature.

Click Here to join Tek-Tips and talk with other members!

Resources

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close