parsing URLs for base URL

Status
Not open for further replies.

dpimental

Programmer
Jul 23, 2002
535
US
I have run a series of internet searches and saved the results. The result set is a list of URLs associated with the keywords used in the searches. I would like to parse each full URL for the following:

1. Base url (e.g.
2. The site type (.com, .net, .org, .edu, etc.).

3. How many times each base url occurs.

David Pimental
(US, Oh)
dpimental@juno.com
 
You gave us very little to work with, not even an actual question. Please read the FAQ in my signature line. Take the time to describe your desired solution in technical terms. If you force us to dig information out of you, it will waste a lot of time, and we'll wind up asking a lot of questions we don't ultimately need the answers to. It will also risk the solution going askew, because unlike you, we aren't focusing on your problem alone, and we'll forget something you said two days earlier.

When you don't provide enough information, it makes us work harder because we have to imagine all the various possibilities that include what you have told us, and solve each one. In this case, you provided so little information that by my estimate there are dozens, if not hundreds, of possibilities. Alternatively, we can pick one set of possibilities and give a solution for that, but then it's likely you'll come back with "No, no, I wanted..." and our effort was wasted.

I realize putting all the technical information together is more work for you, but it saves time in the long run. And after all, it's you who is asking for free help.

Rick Sprague
Want the best answers? See faq181-2886
To write a program from scratch, first create the universe. - Paraphrased from Albert Einstein
 
I thought that I was clear enough; but here goes.

I have a database table called "urls".

I have deposited several thousand urls.
They begin with "…". They have been copied from browsers to MS Excel to Access.

All that I want to do is parse through each URL and find the base URL, which will be formatted "…" and so on.
They will be stored in a field called "baseurl".

I would also like the type of website (.com, .net, etc.), which will be stored in a field called "urltype".

I would also like to query this table and count the number of times each base URL occurs, and would like to use that information in a report. This would be a separate item. I can do the report, but can't seem to get it to order by the largest number of occurrences of each baseurl. But maybe this part of my topic would be better served in a different forum (i.e. one on queries or reports). Let me know if this is enough.
 
Sorry for any confusion on the above note. I am on a machine that two of us use for Tek-Tips.

Some updating is needed.

I have figured out the query report piece.

I have found the following to use to return the base url ...
Left([url],InStr(9,[url],"/"))

So, thanks anyways for your help.

David Pimental
(US, Oh)
dpimental@juno.com
 
The baseurl field is in the same table, right?

The following query should get the base url field set for you:
UPDATE urls SET baseurl = Left$(url, Instr(9, url & "/", "/") - 1)

(You see? I couldn't have given you that query without knowing the table and field names, and the fact that the baseurl field was in the same table. If it had been in a different table, the solution would have been very different.)
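For readers who want to check the string logic outside Access, here is a minimal Python sketch of the same expression. The helper names instr and base_url and the example.com addresses are hypothetical and are not part of the thread. It assumes every URL begins with "http://" or "https://", so that searching from position 9 skips the slashes in the scheme.

```python
def instr(start, text, needle):
    """Mimic Access InStr: 1-based position of needle at or after
    the 1-based position start, or 0 when needle is not found."""
    return text.find(needle, start - 1) + 1


def base_url(url):
    """Mirror Left$(url, InStr(9, url & "/", "/") - 1)."""
    # Appending "/" guarantees a match even when the URL has no path,
    # so the whole URL is returned in that case.
    end = instr(9, url + "/", "/")
    # Left$(url, end - 1): keep everything before the first "/" found.
    return url[:end - 1]


if __name__ == "__main__":
    print(base_url("http://www.example.com/page.html"))
    print(base_url("https://example.org"))
```

Appending "/" before searching is what lets a URL without any path come back unchanged instead of producing an error or an empty result.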

If you're using Access 2000 or later, and urltype is in the same table, use the following query to set the urltype field:
UPDATE urls SET urltype = Mid$(baseurl, InStrRev(baseurl, "."))

This query returns the frequency of occurrence of base URLs, in descending order:
SELECT baseurl, Count(*) As CountOfBaseURL FROM urls
GROUP BY baseurl
ORDER BY Count(*) DESC
You can use this as the basis for a report with no Sorting and Grouping specifications.
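The grouping and ordering that this query performs can be sketched in Python as follows, purely as an illustration. The function name base_url_counts and the sample values are hypothetical; in Access the query itself does this work.

```python
from collections import Counter


def base_url_counts(base_urls):
    """Return (baseurl, count) pairs with the most frequent first,
    like GROUP BY baseurl ORDER BY Count(*) DESC."""
    return Counter(base_urls).most_common()


if __name__ == "__main__":
    sample = [
        "http://www.example.com",
        "http://www.example.org",
        "http://www.example.com",
    ]
    for url, count in base_url_counts(sample):
        print(url, count)
```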


Rick Sprague
Want the best answers? See faq181-2886
To write a program from scratch, first create the universe. - Paraphrased from Albert Einstein
 
Thanks, that was very helpful.
1 quick question. What if I don't want the "." before the urltype and the "/" after the url type?

David Pimental
(US, Oh)
dpimental@juno.com
 
I meant for the first UPDATE to omit the "/" after the baseurl. Did you try it?

To omit the "." before the urltype, change its query to:
UPDATE urls SET urltype = Mid$(baseurl, InStrRev(baseurl, ".") + 1)
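As a rough Python sketch of the same idea, the function below takes everything after the last "." in the base URL. The name url_type is hypothetical. Two choices here are assumptions that go beyond the query: a trailing "/" is stripped first, so the result is the same whether or not baseurl still ends in a slash, and an empty string is returned when there is no "." at all.

```python
def url_type(base):
    """Return the text after the last "." in a base URL.

    Assumption: any trailing "/" is dropped first.
    Assumption: a value with no "." yields an empty string.
    """
    base = base.rstrip("/")
    dot = base.rfind(".")  # 0-based index of the last ".", or -1
    if dot == -1:
        return ""
    # Equivalent in spirit to Mid$(baseurl, InStrRev(baseurl, ".") + 1)
    return base[dot + 1:]


if __name__ == "__main__":
    print(url_type("http://www.example.com"))
    print(url_type("http://www.example.com/"))
```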


Rick Sprague
Want the best answers? See faq181-2886
To write a program from scratch, first create the universe. - Paraphrased from Albert Einstein
 
The removal of the "." works, but there is still a "/" following the domain (e.g. "com/" instead of "com").

Any ideas?

David

David Pimental
(US, Oh)
dpimental@juno.com
 
UPDATE urls SET urltype = Mid$(baseurl, InStrRev(baseurl, ".") + 1, Len(baseurl) - InStrRev(baseurl, ".") - 1)

"... isn't sanity really just a one trick pony anyway?! I mean, all you get is one trick, rational thinking, but when you are good and crazy, oooh, oooh, oooh, the sky is the limit!" - The Tick
 
Sorry, I hadn't updated the Left$ function. It works perfectly now. Thanks for all your help and suggestions.

David

David Pimental
(US, Oh)
dpimental@juno.com
 