Reading Google Result URLs

Status
Not open for further replies.

brinker

Programmer
May 31, 2001
48
CA
Hi,

I am pretty familiar with the Java programming language, and have written code to read the HTML of general URLs. However, if the URL takes the form of a Google search result (i.e.): 4

then my approach fails. The URL is accessible from inside an IE or Netscape browser. Does anybody have any suggestions as to how I can access these types of URLs using the Java programming language?

BTW, this is a sample of my existing code.

// Browser.java
// This program uses a JEditorPane to display the contents
// of a file on a web server.

import java.awt.*;
import java.awt.event.*;
import java.net.*;
import java.io.*;
import javax.swing.*;
import javax.swing.event.*;

public class Browser extends JFrame {
    private JTextField enter;
    private JEditorPane contents;

    public Browser(String firstLink)
    {
        super("Web Browser");
        Container c = getContentPane();
        JPanel upperPanel = new JPanel();
        upperPanel.setLayout(new GridLayout(2, 1));
        enter = new JTextField(firstLink);
        final String firstPage = firstLink;
        // load whatever address the user types into the text field
        enter.addActionListener(
            new ActionListener() {
                public void actionPerformed(ActionEvent e)
                {
                    getThePage(e.getActionCommand());
                }
            }
        );
        upperPanel.add(enter);

        contents = new JEditorPane();
        contents.setEditable(false);
        // follow hyperlinks that are clicked inside the displayed page
        contents.addHyperlinkListener(
            new HyperlinkListener() {
                public void hyperlinkUpdate(HyperlinkEvent e)
                {
                    if (e.getEventType() == HyperlinkEvent.EventType.ACTIVATED)
                        getThePage(e.getURL().toString());
                }
            }
        );
        JScrollPane contentsScrollPane = new JScrollPane(contents);
        contentsScrollPane.setVerticalScrollBarPolicy(JScrollPane.VERTICAL_SCROLLBAR_AS_NEEDED);
        contentsScrollPane.setHorizontalScrollBarPolicy(JScrollPane.HORIZONTAL_SCROLLBAR_AS_NEEDED);
        contentsScrollPane.setPreferredSize(new Dimension(500, 500));
        contentsScrollPane.setMinimumSize(new Dimension(480, 480));
        c.add(contentsScrollPane, BorderLayout.CENTER);

        // set up a JButton to go back to the first page
        JButton finishButton = new JButton("Start Page");
        finishButton.setToolTipText("Click to go Back to the First Page in Browser");
        finishButton.addActionListener(
            new ActionListener() {
                public void actionPerformed(ActionEvent evt)
                {
                    enter.setText(firstPage);
                    getThePage(firstPage); // i.e. load the initial page
                }
            }
        );
        upperPanel.add(finishButton);
        c.add(upperPanel, BorderLayout.NORTH);

        // now set up the initial page
        getThePage(firstLink); // i.e. load the initial page

        // add a window listener
        addWindowListener(
            new WindowAdapter() {
                public void windowClosing(WindowEvent e)
                {
                    // must close the streams - done before here
                    setVisible(false);
                    dispose();
                }
            }
        );

        setSize(700, 500);
        // pack();
        setVisible(true); // replaces the deprecated show()
    }

    private void getThePage(String location)
    {
        setCursor(Cursor.getPredefinedCursor(Cursor.WAIT_CURSOR));
        try {
            contents.setPage(location);
            enter.setText(location);
        }
        catch (IOException io) {
            JOptionPane.showMessageDialog(this, "Error retrieving specified URL",
                "Bad URL", JOptionPane.ERROR_MESSAGE);
        }
        finally {
            // restore the cursor whether or not the page loaded
            setCursor(Cursor.getPredefinedCursor(Cursor.DEFAULT_CURSOR));
        }
    }

}
 
How exactly is your approach failing? Is it hitting the IOException, or is it just pulling up the generic Google front page without doing the search?

Sorry late in the work day, otherwise I would toss it in a file and try it myself :p

-Tarwn
 
When I run a modified version of the above program using threads with join(0), I get an immediate "connection timed out" error. The above program cannot find any URL of this kind. I am thinking that perhaps Google requires a web agent to function with Java, though I do not know how to include one.
 
I think the problem may be that you don't have a URL that points to the page that matches the search result. You have a URL that points to Google's cache of pages. In the case of the specific example you gave, the page could not access its style.css file (no longer in the cache, maybe?), so it immediately returned an error code.

Perhaps you can parse the search result and strip out the actual URL of the result page you want (eliminating the Google address and cache parameters)...
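
A minimal sketch of that stripping step follows. It assumes the cache link has the form http://<host>/search?q=cache:<id>:<original address>+<search terms>&hl=en, which is a guess at the layout because the actual link is not shown in this thread. CacheUrlParser and extractOriginal are hypothetical names, and cache.example and www.example.org are placeholder hosts.

```java
import java.net.URI;

// Hypothetical helper: pulls the original page address out of a cache-style link.
final class CacheUrlParser {

    private static final String PREFIX = "q=cache:";

    // Returns "http://" + original address, or null if no q=cache: parameter is present.
    // Assumes the original page was served over plain http (an assumption, not a given).
    static String extractOriginal(String cacheUrl) {
        String query = URI.create(cacheUrl).getRawQuery();
        if (query == null) {
            return null;
        }
        for (String param : query.split("&")) {
            if (!param.startsWith(PREFIX)) {
                continue;
            }
            String value = param.substring(PREFIX.length());
            // drop the search terms that follow the first '+'
            int plus = value.indexOf('+');
            if (plus >= 0) {
                value = value.substring(0, plus);
            }
            // drop the cache id that precedes the first ':'
            int colon = value.indexOf(':');
            if (colon >= 0) {
                value = value.substring(colon + 1);
            }
            return "http://" + value;
        }
        return null;
    }
}
```

With the assumed layout, extractOriginal("http://cache.example/search?q=cache:abc123:www.example.org/docs/page.html+java&hl=en") yields http://www.example.org/docs/page.html, which could then be passed to getThePage() in the Browser class. Percent-encoded characters are left untouched in this sketch.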
 
No, the file is still in the Google cache. If you try clicking on the HTML link above in my code, it will take you to the Google page. However, for some reason Java cannot access these pages.

Does anybody know how to trick the Google URLs into seeing a web agent?
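
For what it's worth, here is a rough sketch of how a web agent (the User-Agent request header) can be set from Java with java.net.URLConnection. Whether Google actually rejects Java's default agent is only a guess made in this thread and has not been confirmed. AgentFetcher and its method names are hypothetical, and the agent string is whatever you choose to send.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper: fetches a page while sending a caller-chosen User-Agent header.
final class AgentFetcher {

    // Opens (but does not yet connect) a connection with the User-Agent header set.
    static URLConnection openWithAgent(String url, String userAgent) {
        try {
            URLConnection conn = URI.create(url).toURL().openConnection();
            conn.setRequestProperty("User-Agent", userAgent);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the whole response body as text; the network request happens here.
    static String fetch(String url, String userAgent) {
        URLConnection conn = openWithAgent(url, userAgent);
        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return html.toString();
    }
}
```

In the Browser class above, the fetched text could be shown with contents.setContentType("text/html") followed by contents.setText(html) instead of contents.setPage(location). Another option that may be worth trying is setting the http.agent system property before any HTTP connection is opened, since that should also apply to the connections JEditorPane makes itself.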
 
When I hit the link on this page, I go to a page with a blurb at the top telling what URL should be used to bookmark or link to the page.

That URL is different from yours. Rather than the IP address, it uses a host name instead. A ping of that host name shows a different IP address than the one you're using.

Perhaps you can substitute that host name (or the IP address it resolves to) for the IP address in your URL and have it work...
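
A small sketch of that substitution, in case it helps: it rebuilds a URL with a different host and looks up the address a name resolves to, which is roughly what the ping shows. HostSwapper, withHost and resolve are hypothetical names, and 192.0.2.10 and cache.example are placeholders rather than the real addresses from this thread.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;

// Hypothetical helper: swaps the host part of a URL and resolves host names.
final class HostSwapper {

    // Returns the same URL with its host replaced; path, query and fragment are
    // copied in their raw (still encoded) form. User info, if any, is dropped.
    static String withHost(String url, String newHost) {
        URI u = URI.create(url);
        StringBuilder out = new StringBuilder();
        out.append(u.getScheme()).append("://").append(newHost);
        if (u.getPort() >= 0) {
            out.append(':').append(u.getPort());
        }
        if (u.getRawPath() != null) {
            out.append(u.getRawPath());
        }
        if (u.getRawQuery() != null) {
            out.append('?').append(u.getRawQuery());
        }
        if (u.getRawFragment() != null) {
            out.append('#').append(u.getRawFragment());
        }
        return out.toString();
    }

    // Returns the textual IP address for a host name (or for a literal IP address).
    static String resolve(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, withHost("http://192.0.2.10/search?q=java", "cache.example") gives http://cache.example/search?q=java, and the result can be tried in place of the original address.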
 