Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations derfloh on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Search IP in a subnet range

Status
Not open for further replies.

a43131

Programmer
Joined
Nov 2, 2007
Messages
2
Location
US
I am trying to work on a program which searches for an IP in the file and tells if its belongs to a range ?
Example

"202385408","202385919","PR"
"202385920","202386431","US"
"202386432","202387455","VI",
"202387456","202517983","US"
"202517984","202517991","PR"
"202517992","202706431","US"
"202706432","202706943","PR"

So entering 202386435 should return "VI" [ Line 3 ]
and entering 202387460 should return US [ Line 4]

I need to do this using a binary tree or similar data structure since sequential search in a file with 1000,000 records take forever.

Any thoughts would be really appreciated.

Thanks
 
I would suggest putting that data in a DB then.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
I created a test file a format like this
Code:
1,20,test1
21,40,test2
41,60,test3
61,80,test4
81,100,test5
101,120,test6
121,140,test7
141,160,test8
..
19999861,19999880,test999994
19999881,19999900,test999995
19999901,19999920,test999996
19999921,19999940,test999997
19999941,19999960,test999998
19999961,19999980,test999999
19999981,20000000,test1000000
so 1 million lines
and using simple code like this
Code:
open(FILE, "<test.txt");
while(<FILE>) {
	@tmp = split /\,/, $_;
	if ($ARGV[0] >= $tmp[0] && $ARGV[0] <= $tmp[1]) {
		print "$tmp[2]\n";
		last;
	}
}
close FILE;
it takes 8 seconds to find a match at the end of the file.

I'm not sure what your doing but 8 seconds isn't bad and I'm sure someone on here can make it go faster.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]
Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;
 
Unless you are searching for many IPs at once there may be no advantage to creating a binary tree structure before doing the actual search.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Are your ranges all consecutive as in your example?
Are your ranges all numerically sorted in the file as in your example?
Are your ranges all represented by integers that can fit into a Perl 32 bit integer as in your example?
Are the lines in your file all the same length as in your example?
Assuming four positive answers (but only the second one is essential to the code below), you can simply use a dicotomic (binary) search routine:
Code:
print find(202386435);
sub find{
  use integer;
  local(*F);
  my$LON=30;
  my($ser)=@_;
  open(F,"test.txt");
  my$dim=-s F;
  my$lin=$dim/$LON;
  my$locplus=--$lin;
  my$locminus=0;
  my$loc=$lin/2;
  my($ini,$fin,$cla);
  seek F,$loc*$LON,0;
  while(<F>){
    ($ini,$fin,$cla)=split',';
    $ini=eval$ini;
    if($ser<$ini){
      return"Not found!"if$loc<=0;
      $locplus=$loc;
      $loc=($loc+$locminus)>>1;
    }else{
      return eval$cla if($ser<=eval$fin);
      return"Not found!"if$loc>=$lin;
      $locminus=$loc;
      $loc=($locplus+$loc)>>1;
      $loc++;
    }
    seek F,$loc*$LON,0;
  }
}
Of course this will take a tiny fraction of a second on a 1 million records file.
[tt]$LON[/tt] should be set to the line length of your file, including the line terminator (recalling that under Windows the line terminator is 2 chars).
If line lengths are non constant, it can be done in quite the same way, just a bit trickier.
Of course it would be a bit faster if there were no quotes in the file.
Note that I didn't test what would happen with this code if the ranges were not consecutive or (worse) were overlapping: some more error checking should be added.
Last note: of course the dimension of the file should fit into a perl integer (~2 GB).


Franco
: Online tools for structural design
: Magnetic brakes for fun rides
: Air bearing pads
 
prex1:

Answer to all the questions is Yes. I haven't yet tried this routine but seems to solve to the problem.

Infact, solutions proposed above also solved my issue. The thing I was doing wrong was inserting a million sequential elements in a binary tree which was resulting in inserts only on the left node which resulted in very long recursive times. Lame I know.

Thanks a lot for the solutions guys. This is really helpful.

Aashish
 
Thats pretty cool Franco.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top