Search IP in a subnet range

a43131 · Nov 2, 2007

I am trying to work on a program which searches for an IP in the file and tells if its belongs to a range ?
Example

"202385408","202385919","PR"
"202385920","202386431","US"
"202386432","202387455","VI",
"202387456","202517983","US"
"202517984","202517991","PR"
"202517992","202706431","US"
"202706432","202706943","PR"

So entering 202386435 should return "VI" [ Line 3 ]
and entering 202387460 should return US [ Line 4]

I need to do this using a binary tree or similar data structure since sequential search in a file with 1000,000 records take forever.

Any thoughts would be really appreciated.

Thanks

travs69 · Nov 2, 2007

I would suggest putting that data in a DB then.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]

Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;

travs69 · Nov 2, 2007

I created a test file a format like this

Code:

1,20,test1
21,40,test2
41,60,test3
61,80,test4
81,100,test5
101,120,test6
121,140,test7
141,160,test8
..
19999861,19999880,test999994
19999881,19999900,test999995
19999901,19999920,test999996
19999921,19999940,test999997
19999941,19999960,test999998
19999961,19999980,test999999
19999981,20000000,test1000000

so 1 million lines
and using simple code like this

Code:

open(FILE, "<test.txt");
while(<FILE>) {
	@tmp = split /\,/, $_;
	if ($ARGV[0] >= $tmp[0] && $ARGV[0] <= $tmp[1]) {
		print "$tmp[2]\n";
		last;
	}
}
close FILE;

it takes 8 seconds to find a match at the end of the file.

I'm not sure what your doing but 8 seconds isn't bad and I'm sure someone on here can make it go faster.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[noevil]

Travis - Those who say it cannot be done are usually interrupted by someone else doing it; Give the wrong symptoms, get the wrong solutions;

KevinADC · Nov 2, 2007

Unless you are searching for many IPs at once there may be no advantage to creating a binary tree structure before doing the actual search.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]

prex1 · Nov 3, 2007

Are your ranges all consecutive as in your example?
Are your ranges all numerically sorted in the file as in your example?
Are your ranges all represented by integers that can fit into a Perl 32 bit integer as in your example?
Are the lines in your file all the same length as in your example?
Assuming four positive answers (but only the second one is essential to the code below), you can simply use a dicotomic (binary) search routine:

Code:

print find(202386435);
sub find{
  use integer;
  local(*F);
  my$LON=30;
  my($ser)=@_;
  open(F,"test.txt");
  my$dim=-s F;
  my$lin=$dim/$LON;
  my$locplus=--$lin;
  my$locminus=0;
  my$loc=$lin/2;
  my($ini,$fin,$cla);
  seek F,$loc*$LON,0;
  while(<F>){
    ($ini,$fin,$cla)=split',';
    $ini=eval$ini;
    if($ser<$ini){
      return"Not found!"if$loc<=0;
      $locplus=$loc;
      $loc=($loc+$locminus)>>1;
    }else{
      return eval$cla if($ser<=eval$fin);
      return"Not found!"if$loc>=$lin;
      $locminus=$loc;
      $loc=($locplus+$loc)>>1;
      $loc++;
    }
    seek F,$loc*$LON,0;
  }
}

Of course this will take a tiny fraction of a second on a 1 million records file.
[tt]$LON[/tt] should be set to the line length of your file, including the line terminator (recalling that under Windows the line terminator is 2 chars).
If line lengths are non constant, it can be done in quite the same way, just a bit trickier.
Of course it would be a bit faster if there were no quotes in the file.
Note that I didn't test what would happen with this code if the ranges were not consecutive or (worse) were overlapping: some more error checking should be added.
Last note: of course the dimension of the file should fit into a perl integer (~2 GB).

Franco

http://www.xcalcs.com

: Online tools for structural design

http://www.megamag.it

: Magnetic brakes for fun rides

http://www.levitans.com

: Air bearing pads

a43131 · Nov 3, 2007

prex1:

Answer to all the questions is Yes. I haven't yet tried this routine but seems to solve to the problem.

Infact, solutions proposed above also solved my issue. The thing I was doing wrong was inserting a million sequential elements in a binary tree which was resulting in inserts only on the left node which resulted in very long recursive times. Lame I know.

Thanks a lot for the solutions guys. This is really helpful.

Aashish

KevinADC · Nov 3, 2007

Thats pretty cool Franco.

------------------------------------------
- Kevin, perl coder unexceptional! [wiggle]

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Search IP in a subnet range

a43131

Programmer

travs69

MIS

travs69

MIS

KevinADC

Technical User

prex1

Programmer

a43131

Programmer

KevinADC

Technical User

Similar threads

Part and Inventory Search

Sponsor