
decode and encode numbers to text

Status
Not open for further replies.

RooiVolla

Programmer
Jun 14, 2007
5
ZA
I need to find a way to take 6000 numbers and encode them into as few characters as possible. By as few as possible I mean fewer than 100 characters. Does anyone know if this is possible? I'm not sure it is, and if not I need to prove it. I've been looking at all kinds of compression and decoding algorithms, but I can't get below 860...

 
No, I work for an insurance company and we need to be able to send the brokers a fax with a bunch of codes. They then have to type the codes into our application, which decodes them into rates. This needs to happen every week. The actuaries came up with this brilliant plan, but they don't know how to do the encoding and decoding...
 
Is the file size a problem?
You seem to be talking about Encryption rather than Compression.


Steve [The sane]: Delphi a feersum engin indeed.
 
I'll try to explain as best I can. We need to update 860 rates, and each rate can be up to 7 digits long.

eg.
34522.57 = rate 1
45238.67 = rate 2
.
.
67523.49 = rate 860.

So we need to send a fax with encoded / compressed codes, so that we can get the rates back from these codes. The codes can use any type of character. So the fax must look something like this:

ADsr34-Edsr2-WsqR-WqaW34
Swa432-EwSdf-WqSW-EwSd43
.
.

This must happen every week, and we can't make the user type in too many characters because it will take forever. I've seen some other insurance companies do it in as few as 70 characters. My question is: is it possible to encode / compress the rates into codes of fewer than, say, 100 characters? If so, how would I go about doing it?
 
So each of your 8000 numbers is a decimal value from 0 to 99999 with at least 2 decimal places?
If this is the case I think you did well to get this down to 860 alphanumeric chars!




Steve [The sane]: Delphi a feersum engin indeed.
 
And you are sure you want your users to have to type these sequences every day? I'd try to convince my users to do this once a year, and even then they'll complain, but doing this every day is a nightmare!

I'd go for some file exchange/import to deploy the data, giving them a login code to verify their credentials, and then release a string/file to be imported into the app.
Or just send the appropriate people an e-mail with a file attached. The content of that file could be encoded in the aforementioned way, just to be sure nobody else can use your data, and you can even apply a personal key to the encoding if required.

The fax is a really ancient way of communicating this kind of data! We've grown out of the stone age, I hope?

HTH
TonHu
 
The problem is that about 5% of the users do not have access to the internet or email; they live in very small towns in South Africa. That's why we need the fax option. For the other 95% we can update it online, no problem.

The rates we need to update are the latest rates for investment portfolios (these are related to the worldwide money markets), so if the client signed the quote for a specific rate we need to honour the quote. That's why the updates need to happen every week: after a week the quote becomes invalid.
 
Well, in that case I'd at least make the data non-case-sensitive, and also treat O and 0 (letter o and number zero) as the same, so if they typed an entire line with caps-lock in the wrong position, they don't have to re-enter the weird data-sequence.
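A minimal sketch of that kind of forgiving input handling, in Python. The function name `normalize` is hypothetical, and stripping separators such as hyphens is an added assumption based on the sample codes earlier in the thread:

```python
def normalize(typed: str) -> str:
    """Fold a typed code into canonical form before decoding.

    Assumptions (not from the thread): hyphens and spaces are only
    visual separators, and the encoder never emits the letter 'O'.
    """
    # Caps-lock mistakes disappear once everything is upper-cased.
    cleaned = typed.upper()
    # Treat the letter O and the digit zero as the same symbol.
    cleaned = cleaned.replace("O", "0")
    # Drop separators and anything else that is not a letter or digit.
    return "".join(ch for ch in cleaned if ch.isalnum())


print(normalize("adsr34-edsr2-wsqr-wqaw34"))
```

The price of this convenience is a smaller alphabet: a case-insensitive code with O and 0 merged has at most 35 usable letters and digits, so each typed character carries less information and the fax gets longer.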

Probably the very small towns in South Africa are close to the stone-age, then... :-(

HTH
TonHu
 
Interesting problem, actually I might play around with it and see what I can come up with.

But to just make sure I understand, your problem is to compress/encode up to 860 numeric text values so it will take up the minimum amount of text?

Criteria:
1. Always 7 digits, with 2 decimal places?
2. Does the decimal have to be transmitted with the data or is it implied? Example for #1 and #2: Can I have something like 12367 or 123.67 or 0012367?
3. Does binary data get transmitted (like longints or currency), or is it always some form of text?
basically I'm wanting to know what the input data to the encoding side looks like

Process:
1. A program on your end will encode these values, then the values are transmitted (by your program or manually) via fax to another place where...
2. A person on the other end receives the fax, then inputs these data into another program in a computer they have there which decodes the text into the original rates?
 
More information please. Perhaps the range of each rate is not from 0.01 to 99999.99? Is there a smaller range? This would help. Is it possible that a rate could exceed 99999.99 in the future? If so, then this exercise could be an entire waste of time depending on the encoding scheme we might come up with.

TonHu, it won't matter how complex or hard to read the fax is, because we can always include check digits every 10-15 characters. We can use any readable and typeable symbol available to them and us.
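As a rough illustration of the check-digit idea, here is a Python sketch that appends one check character to every group of 10 payload characters. The 32-symbol alphabet, the group size, and the simple sum-modulo checksum are all assumptions made for the example, not something any poster specified:

```python
# Hypothetical 32-symbol alphabet without the easily confused I, L, O, U.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def check_char(group: str) -> str:
    """Check character: sum of the symbol indices modulo the alphabet size."""
    total = sum(ALPHABET.index(ch) for ch in group)
    return ALPHABET[total % len(ALPHABET)]


def add_checks(payload: str, group_size: int = 10) -> str:
    """Split the payload into groups and append a check character to each."""
    groups = [payload[i:i + group_size] for i in range(0, len(payload), group_size)]
    return "-".join(g + check_char(g) for g in groups)


def bad_groups(line: str) -> list[int]:
    """Return the positions of groups whose check character does not match."""
    bad = []
    for n, group in enumerate(line.split("-")):
        if not group or check_char(group[:-1]) != group[-1]:
            bad.append(n)
    return bad


coded = add_checks("ABCDEFGH1234567890XYZ")
print(coded, bad_groups(coded))
```

A plain sum catches any single mistyped character in a group, so the user only retypes that group. It does not catch two swapped characters; a position-weighted checksum would be needed for that.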

Considering that RAR and ZIP can only manage around 50% compression, I'd say you've done exceedingly well to get more than 80% compression.

How do you know that other insurance companies have achieved such amazing encoding? A better avenue might be to 'find' out their algorithm. Have you ever seen one of their faxes? One shouldn't be too hard to find. The same offices you're faxing to might have one they can fax back to you.
 
Glenn9999, to answer your questions:

1. Yes, always 7 digits with 2 decimals.
2. No, the decimal point does not have to be transmitted; you can bargain on the fact that digits 6 and 7 are the decimals.
3. the input data looks like this:

3747563000684700947375594827592834515869493
5748394473284832882845365684354835439458732
.
.

which means:
rate 1 = 37475.63
rate 2 = 00068.47
rate 3 = 00947.37
.
.

We write the application that encodes/compresses and decodes/decompresses this.

Your process steps are 100% correct.

Griffyn:

No, the rate will never exceed 99999.99, not in my lifetime anyway.

Yep, I've got examples of these faxes from 3 of our competitors, but none of them wants to share their algorithm.
 
I thought about sorting and encoding an index and the difference between each successive element, but there's no way it will ever reduce to less than a single character per rate. Certainly not 8-10 rates in a single character. I boldly state that such a thing is impossible. At best I could probably manage around 2500 printed characters for 860 x 7 digit numbers using the index and difference approach.
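A counting argument backs this up. 860 rates of 7 digits give 10^6020 possible rate tables, and a code of n characters over an alphabet of k symbols can tell apart at most k^n of them. The Python sketch below finds the smallest n for a few alphabet sizes; the function name and the sizes tried are illustrative assumptions, and it assumes every rate table is possible (no knowledge of last week's rates):

```python
def min_chars(num_rates: int, digits_per_rate: int, alphabet_size: int) -> int:
    """Smallest n with alphabet_size**n >= number of distinct rate tables."""
    # Every digit string of this length is a possible table.
    tables = 10 ** (num_rates * digits_per_rate)
    n, capacity = 0, 1
    while capacity < tables:
        capacity *= alphabet_size  # one more character multiplies the capacity
        n += 1
    return n


# 32: case-insensitive letters+digits, 62: mixed case, 94: printable ASCII,
# 255: "any byte", which is not realistically typeable from a fax.
for size in (32, 62, 94, 255):
    print(size, min_chars(860, 7, size))
```

For any alphabet that can be typed from a fax this comes out in the thousands of characters. To fit arbitrary tables into 100 characters, each character would have to carry about 60 decimal digits, so a 70-character fax must be carrying less information, for example only the changes.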

The simple and obvious method would be to encode each rate to a 3 byte array, put them all together and then run a MIME encoder over them. This would come to about 3200 printed characters.
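A sketch of that 3-byte approach in Python, assuming rates are held as whole cents (0 to 9999999, which fits in 24 bits) and using standard Base64 as the MIME-style encoder. The function names are hypothetical:

```python
import base64


def pack_rates(cents: list[int]) -> str:
    """Pack each rate (in cents) into 3 big-endian bytes, then Base64 the lot."""
    raw = b"".join(c.to_bytes(3, "big") for c in cents)
    return base64.b64encode(raw).decode("ascii")


def unpack_rates(text: str) -> list[int]:
    """Reverse pack_rates: Base64-decode and read back 3 bytes per rate."""
    raw = base64.b64decode(text)
    return [int.from_bytes(raw[i:i + 3], "big") for i in range(0, len(raw), 3)]


sample = [3452257, 4523867, 6752349]  # 34522.57, 45238.67, 67523.49
encoded = pack_rates(sample)
print(encoded, unpack_rates(encoded) == sample)
print(len(pack_rates([9999999] * 860)))  # size of a full 860-rate table
```

Base64 turns every 3 bytes into 4 characters, so 860 rates (2580 bytes) come to 3440 characters, in the same ballpark as the estimate above and nowhere near 100.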

How about we look at these actual numbers. What are they? And how are they generated? Is there any chance we can predict what one will be based on other rates, or rates from the previous day? Maybe your competitors are simply faxing a list of changes to each rate? This would drastically reduce the amount of data needed to be transmitted, especially if many rates stayed the same.
 
I ended up testing longint values of maximum 7 digits, I noticed the first byte of the longint wasn't used so I stripped those. The resulting 2560 byte file did not compress using a ZIP file utility (good sign we're not getting it any smaller). And to text encode it, it will increase it roughly 20%, to around 3000 bytes at minimum.
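That experiment is easy to repeat. The sketch below is Python rather than Delphi, uses zlib in place of a ZIP utility, and generates 860 random rates with a fixed seed, which is an assumption: real rates may be less random than this.

```python
import random
import zlib


def packed_sizes(num_rates: int = 860, seed: int = 0) -> tuple[int, int]:
    """Return (raw size, zlib-compressed size) for random 3-byte packed rates."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rates = [rng.randrange(10**7) for _ in range(num_rates)]
    raw = b"".join(r.to_bytes(3, "big") for r in rates)
    return len(raw), len(zlib.compress(raw, 9))


raw_len, zipped_len = packed_sizes()
print(raw_len, zipped_len)
```

With values this close to random there is almost nothing for a general-purpose compressor to remove: a 7-digit number needs about 23.25 bits, and 3 bytes is 24.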

I tried binary coded decimal, and got similar results.

My guess is that they're only sending changes through and not the whole list. You might end up coding "rate # change", "rate value" pairs.
 
And as another thought, instead of sending a new rate, you could send a value representing the change in the rate. If it goes down 3.00 then -3.00, if up 12.63 then send +12.63. The smaller the numbers, the more that can be stripped out.
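Both ideas (send only the rates that changed, and send the change rather than the new value) can be combined. A small Python sketch, assuming both sides hold last week's table as whole cents; the function names and sample values are made up for illustration:

```python
def changes(old: list[int], new: list[int]) -> list[tuple[int, int]]:
    """List (rate index, signed change in cents) for every rate that moved."""
    return [(i, n - o) for i, (o, n) in enumerate(zip(old, new)) if n != o]


def apply_changes(old: list[int], deltas: list[tuple[int, int]]) -> list[int]:
    """Rebuild this week's table from last week's table plus the changes."""
    updated = list(old)
    for index, delta in deltas:
        updated[index] += delta
    return updated


last_week = [3452257, 4523867, 6752349]
this_week = [3451957, 4523867, 6753612]  # down 3.00, unchanged, up 12.63
delta_list = changes(last_week, this_week)
print(delta_list, apply_changes(last_week, delta_list) == this_week)
```

How much this saves depends entirely on how many rates move each week and by how much. It also means a broker who misses one fax is out of step until a full table is sent again.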
 
Can you please give very clear instructions on what should be done? A graphical explanation would be a good resource.

Which numbers are most common, or are there none? etc.

As far as I've understood it, the raw and unsophisticated solution would be just to convert the numbers to base 255 and then transmit the data.
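A sketch of that base-conversion idea in Python: treat the whole digit stream as one big number and rewrite it in a larger base. Base 62 (digits plus upper- and lower-case letters) is used here instead of 255 only because those symbols can be printed and typed; the alphabet and function names are assumptions.

```python
import string

# Hypothetical typeable alphabet: 0-9, a-z, A-Z (62 symbols).
ALPHABET = string.digits + string.ascii_letters


def encode(digits: str, alphabet: str = ALPHABET) -> str:
    """Rewrite a decimal digit string in base len(alphabet)."""
    value = 1  # leading sentinel so leading zeros in the rates survive
    for ch in digits:
        value = value * 10 + int(ch)
    base = len(alphabet)
    out = []
    while value:
        value, rem = divmod(value, base)
        out.append(alphabet[rem])
    return "".join(reversed(out))


def decode(text: str, alphabet: str = ALPHABET) -> str:
    """Reverse encode(): back to the original decimal digit string."""
    base = len(alphabet)
    value = 0
    for ch in text:
        value = value * base + alphabet.index(ch)
    out = []
    while value > 1:  # stop when only the sentinel 1 is left
        value, digit = divmod(value, 10)
        out.append(str(digit))
    return "".join(reversed(out))


stream = "345225745238676752349"
print(encode(stream), decode(encode(stream)) == stream)
```

The digits are accumulated one at a time rather than with `int(digits)` because recent Python versions limit the length of strings converted to integers by default. This is about as compact as a code can get without knowing anything about the rates, and a full 6020-digit table still needs well over 3000 characters in base 62.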
 
