garbling a name


(OP)
I have been asked to create a program to 'hide' the actual names in a certain field in a VFP table. Thus I would replace the existing string of characters with a new string of random characters.

Thus far I have the code to loop through each character in the string, but I am not able to do the replacement character by character. My first attempt is this function:

CODE

Function garble(oldchar)
LOCAL charnum
LOCAL newchar
LOCAL newcharnum
charnum=INT(ASC(oldchar))
newcharnum = charnum * RAND()
newchar= CHR(newcharnum )
Return newchar 
This of course produces a range of characters, not all of which are in the range A-Z.
How can I create replacement characters in the range A-Z?
Many thanks

gendev

RE: garbling a name

To create a random letter A-Z all you need is CHR(ASC("A")+INT(RAND()*26)) or CHR(65+INT(RAND()*26)).
Just ensure you initialize the random number generator with a different seed every program start by once calling RAND(-1).

The idea to input the old character makes no sense to me.

Bye, Olaf.

RE: garbling a name

(OP)
Many thanks Olaf!

Gendev

RE: garbling a name

One idea to make sense of the old character: Only return a new random letter, IF ISALPHA(oldchar).

CODE

FUNCTION garble(tcOldChar)
LOCAL lcNewChar
IF ISALPHA(tcOldChar)
   lcNewChar = CHR(65+INT(RAND()*26))
ELSE
   lcNewChar = tcOldChar &&keep the non letter
ENDIF
Return lcNewChar 

I would rather do it in one go with the full name instead of each char and include the loop over all name string positions in the function. It's overkill to make a call per character as that has an overhead for running very little net code each time.
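A minimal sketch of that idea, assuming a function name GarbleName (hypothetical, not from this thread) and that only letters should be replaced while spaces and punctuation stay in place:

```foxpro
= RAND(-1)                       && seed the generator once per program start
? GarbleName("O'Brien, Pat")     && random capitals; apostrophe, comma and space are kept

FUNCTION GarbleName(tcName)
   * Hypothetical helper: loops over the whole name in a single call
   LOCAL lcResult, lcChar, lnPos
   lcResult = ""
   FOR lnPos = 1 TO LEN(m.tcName)
      lcChar = SUBSTR(m.tcName, m.lnPos, 1)
      IF ISALPHA(m.lcChar)
         lcChar = CHR(65 + INT(RAND() * 26))   && random letter A-Z
      ENDIF
      lcResult = m.lcResult + m.lcChar
   ENDFOR
   RETURN m.lcResult
ENDFUNC
```

Called once per record, this avoids the per-character call overhead. The result is random, so the same name will garble differently each time.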

Bye, Olaf.

RE: garbling a name

(OP)
Thanks again

Gendev

RE: garbling a name

Why not hash it?
Make it into a 32 character hash.

Regards

Griff
Keep Smileing

There are 10 kinds of people in the world, those who understand binary and those who don't.

I'm trying to cut down on the use of shrieks (exclamation marks), I'm told they are !good for you.

RE: garbling a name

If it is for anonymous display, garbage letters are still better for layout/display and less distracting or annoying than a hash value. For anonymization, I would go a different route and use mockup data, e.g. as provided by https://www.mockaroo.com/, and then simply fill in random first and last names from the mockup data in place of the real names.

It's true, that there are different solutions for different purposes, if a non-technical customer actually meant encrypting data, neither hash nor random letters would be a good solution.

Bye, Olaf.

RE: garbling a name

If the aim is to create test data, that is, data that is completely fictitious but nevertheless looks realistic, this is what I do:

I have a bunch of tables, each containing ten rows. One table contains common first names, another has common family names, another has cities, and so on. I then do a cartesian join of the relevant tables. With four such tables, this gives 10^4 rows, each completely different. I then select the required number of rows at random from that result set.

For example, if I want 100 rows, containing first name, last name, street address and city, I would do this:

CODE -->

SELECT ;
  FirstName, LastName, Street, City, RAND() AS Selector ;
  FROM FirstNames, LastNames, Streets, Cities ;
  INTO CURSOR csrTemp

SELECT TOP 100 * FROM csrTemp ORDER BY Selector ;
  INTO CURSOR csrFinal 

However, now that Olaf has told us about Mockaroo, I probably won't do that any more. Mockaroo seems to offer a great deal more data, with more realistic ways of combining it. For example, not only does it give you cities and countries, but the cities are real cities in the corresponding countries. Even the international phone dialling codes match the countries. I don't mean this to sound like a commercial, but I wish I knew about Mockaroo years ago.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads

RE: garbling a name

Quote (gendev)

create a program to 'hide' the actual names in a certain field in a VFP table

Obviously you wouldn't be storing the data in the first place if you didn't need it in its original form at some point in time.

With that in mind, you should probably differentiate between SHOWING 'garbled' data in a user screen and possibly 'garbling' the data in the table field (which would be Encrypting such that it could be Decrypted when needed).

Merely replacing table field values with random characters will not be un-doable if needed.

Good Luck,
JRB-Bldr




RE: garbling a name

jrbbldr, you're right,

but you could fork (if/else) between SELECT name,... FROM table and SELECT garbled(name) as name FROM table; that will not change any data, just show random letters instead. You could also jumble the data for the use case of handing it out to external developers without giving them real data. It does not necessarily have to be done to the original data itself, but to a developer database copy or extract, perhaps shrunk in size at the same time.

Bye, Olaf.

RE: garbling a name

Hi,

Apart from Olaf's solution you can also make use of this function to scramble the content of your database

CODE --> vfp

Function scrambling
Parameters tcIn, tlScramble
*!* function to scramble the content of fields into your cursor.
*!* this is NOT a decryption, simply a low-level scrambling.
*!* to scramble:
* Select id, scrambling(name,.T.) as name From yourtable Into Cursor myCursor nofilter
*!* to unscramble:
* Select id, scrambling(name,.F.) as name From yourtable Into Cursor myCursor nofilter

Local lcDecrypt As String, ;
	lcEncrypt As String, ;
	lcLet As String, ;
	lcScram As String, ;
	lnPos As Number, ;
	lnPosition As Number

Local lcIn As String

*!* the constants DECRYPTY and ENCRYPTY are shown here below the code

lcScram = []

Do Case
Case Vartype(m.tcIn) = 'C'
	lcIn = Alltrim(m.tcIn)
Case Inlist(Vartype(m.tcIn),'N','Y')
	lcIn = Alltrim(Transform(m.tcIn))
Case Vartype(m.tcIn)= 'D'
	lcIn = Alltrim(Dtoc(m.tcIn))
Case Vartype(m.tcIn)='T'
	lcIn = Alltrim(Ttoc(m.tcIn,1))
Otherwise
	lcIn = [] && unsupported type (e.g. logical or .NULL.): nothing to scramble
Endcase
If m.tlScramble = .T.
	For lnPos = 1 To Len(m.lcIn)
		lcLet = Substr(m.lcIn,m.lnPos,1)
		lnPosition = At(m.lcLet, DECRYPTY)
		lcScram = m.lcScram+Substr(ENCRYPTY ,m.lnPosition,1)
	Endfor
Else
	For lnPos = 1 To Len(m.lcIn)
		lcLet = Substr(m.lcIn,m.lnPos,1)
		lnPosition = At(m.lcLet,ENCRYPTY )
		lcScram = m.lcScram+Substr(DECRYPTY ,m.lnPosition,1)
	Endfor
Endif

Return m.lcScram 

#Define DECRYPTY "abcdefghijklmnopqrstuvwxyzABSCDEFGHIJKLMNOPQRSTUVWXYZ1234567890äáàâÄÁÀÂëéèêËÉÈÊïíìîÏÍÌÎöóòôÖÓÒÔüúùûÜÚÙÛÇç@.-&#"
#Define ENCRYPTY "Ï3ÍTUVÂë56éWXYêËQÛÇBÉZaÚ4ÙbÌÎöfghijkó7pq8HIâÄÔáASCÁ90òMwxyNOPÒRSîúùnorstûÜç@.-&#GÀèJKLôÖz12äDEFcdelmuvàÈÊïíìÓü"

Jockey2

RE: garbling a name

(OP)
Hi Jockey2

Thanks for your code.
I'm trying to use it as a test but I can't get the #DEFINEs to kick in.
I've tried putting them
-at the top of main.prg
-at the top of the prg which uses the function
-within the function declaration.

None work - I haven't used DEFINE before so I am at a loss what to try next.
Can you help?

Gendev

RE: garbling a name

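#DEFINEs have to come before the usage of the constants, that's the only condition about them. Keep in mind they are resolved at compile time and only apply within the PRG file they are written in, so putting them into main.prg does not help a function stored in another PRG.

If you want to use this code because you want to encrypt and decrypt names, not just "garble" them, then better make use of the crypto API or, what's simpler to use, vfpencryption.fll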

https://www.sweetpotatosoftware.com/blog/index.php...

Especially, if it's about HIPAA compliance it's not just about "garbling" names somehow, but algorithms to be used are clearly specified.

Bye, Olaf.

RE: garbling a name

(OP)
Olaf,
I was keen to just try the code.
The user has explained he wants a much more complex solution whereby names will be 'garbled' the same way whenever they occur in the field so that they can sort as normal in the genealogy application.
I'm not able to code that so I have bowed out.
Many thanks

Gendev

RE: garbling a name

Garbling and keeping sort order is impossible, unless you SELECT garbled(originalname) as garbledname and ORDER BY originalname. It's as simple as that: you still have the original name at hand when you query. But of course the garbled names will not be in alphabetical order. The difference between garbling by creating random names and en-/decryption or en-/decoding is that the latter let you get the original name back, while random new names are the only really good way of anonymizing names.

Jockey's code works as is. I don't know what went wrong with the constants; maybe just try to copy and paste again.

CODE -->

? scrambling("Olaf",.T.) && HWÏV
? scrambling(scrambling("Olaf",.T.),.F.) && Olaf 

It could be simplified using CHRTRAN(lcText,DECRYPTY,ENCRYPTY) for encrypting and CHRTRAN(lcText,ENCRYPTY,DECRYPTY) for decrypting. It is a bit better than Caesar encryption, but you provide the translation table with the code, so by definition this is an encoding rather than an encryption. You'd need to keep the ENCRYPTY constant secret to prevent decoding; only then does it become an encryption method. There are still hints remaining in encrypted names: e.g. as the letter E is most frequent (also in names), you can find which encryption character corresponds to E, etc. Since each character has an encryption character independent of any other, you can decrypt letter by letter. Having many such "encrypted" names, you can also find the ENCRYPTY constant, even if the code did not reveal it. This is a big weakness. But Jockey also says so in the comment. I'd specify this is not an ENcryption. You could also argue it's not a scrambling, as it doesn't shuffle letters: any initial "a" is always encoded as "Ï", so it neither simply changes positions as in shuffling, nor does the encoded letter ever differ; that would require the constants to change. In an encryption the "scrambling" would also depend on a key pair and/or password. These differences make this a mere encoding, though not an industry standard encoding as ANSI codepages are or as base64 is. It's just a mapping of original and encoded characters.
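A minimal sketch of the CHRTRAN() variant. To keep it short it uses two made-up 26-letter constants (PLAIN_CHARS and MAPPED_CHARS are assumptions for illustration, not Jockey's DECRYPTY/ENCRYPTY), and the function names Encode/Decode are hypothetical as well:

```foxpro
* Illustration only: lower-case letters mapped onto a permutation of themselves
#DEFINE PLAIN_CHARS  "abcdefghijklmnopqrstuvwxyz"
#DEFINE MAPPED_CHARS "qwertyuiopasdfghjklzxcvbnm"

? Encode("olaf")            && gsqy
? Decode(Encode("olaf"))    && olaf

FUNCTION Encode(tcText)
   * Each character found in PLAIN_CHARS is replaced by the one
   * at the same position in MAPPED_CHARS
   RETURN CHRTRAN(m.tcText, PLAIN_CHARS, MAPPED_CHARS)
ENDFUNC

FUNCTION Decode(tcText)
   * Same call with the two tables swapped
   RETURN CHRTRAN(m.tcText, MAPPED_CHARS, PLAIN_CHARS)
ENDFUNC
```

CHRTRAN() leaves characters that are not in the first table unchanged, so spaces and anything else outside the table pass through as they are.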

Bye, Olaf.

RE: garbling a name

Hi Gendev

your question:

I'm trying to use it as a test but I can't get the #DEFINEs to kick in.

Suggest you copy the #DEFINEs, given in my post below the code, into the code in place of the remark
*!* the constants DECRYPTY and ENCRYPTY are shown here below the code

The only reason I put them 'outside' the code is that when I put them inside a code block here at Tek-Tips the content seems to get garbled.

If this is not clear, please report back.

BTW my code will, I suppose, meet your user's requirements as stated in your last message to Olaf.

Take care: This is NOT a decryption code in any way; it is a 'garbling' code, or as Olaf says an 'encoding' code. It simply replaces a character with another as given in the constant ENCRYPTY and changes it back as given in the constant DECRYPTY. So you can index.

Regards,

Jockey2

RE: garbling a name

>garbage letters are still better for layout/display and less distracting or annoying than a hash value

Could always apply a simple Base64 encoding to turn the hash into garbled characters ... in fact just apply the Base64 encoding to the original text - that should provide sufficient garbling AND vaguely maintain sort order, and it is reversible.

RE: garbling a name

It's a bit questionable, if it should be reversible, and if so, then most probably not by a standardization of the encoding, but using real encryption like AES, which needs the knowledge of knowing the necessary password or having the necessary key or certificate and/or the access to it also limited by permissions.

And anonymization would mean, you want to prevent the possibility to know the original name.

As gendev bowed out of this job, I guess we'll never know the exact conditions.


Base64 is indeed just spreading bits, adding in 00 bits every 6 bits of the original bytes, 3 original bytes (24 bits) become 4x6bits+4x2bits = 4 new bytes. It would keep the binary sort order, as the binary code is indeed just shifted.

It only visually hides names. I couldn't decode just in my head, but take a copy of the "garbled" base64 encoding and put it into STRCONV and you'd have the clear text name. Bad idea.

? STRCONV("Olaf",13) && T2xhZg==
? STRCONV("T2xhZg==",14) && Olaf

Just because something looks encrypted, it isn't encrypted. Once you see something with mainly just letters and numbers you can simply make the experiment to base64 decode it and are likely successful.

If it was just about visual appearance you could also simply display the hex representation, but I would be able to detect many letters by knowing A=0x41, B=0x42, ... for example. You could also replace all names with * or x. Obfuscation can be much simpler.

Bye, Olaf.

RE: garbling a name

Sure; I'm not arguing the pros and cons of proper encryption. Nor suggesting that hashing, or Base64 encoding is a substitute for encryption. Just that they *might* be viable for the OP, who has not as yet really stated whether they simply want obfuscation or not (and that they might be a solution that is within their coding skills, so that they could bow back in)

RE: garbling a name

Hi Strongm,

I doubt that, if you apply a real encryption, the requirement (quote) whereby names will be 'garbled' the same way whenever they occur in the field so that they can sort as normal in the genealogy application (unquote) would work. I don't think so.
As Olaf pointed out, "Olaf" will always be transformed to "HWÏV"

Regards,

Jockey2

RE: garbling a name

>I doubt if ...

I am aware of that. I am not suggesting that real encryption would result in a sortable set of results consistent with the original data.

>As Olaf pointed out "Olaf" will always to transformed to "HWÏV"

Not sure what point you are trying to make here.

>Take care: This is NOT a decryption code in anyway it is a 'garbling' code, or as Olaf says an 'encoding' code, it simply replaces a character with an other as given in de constant ENCRYPTY and changes back as given in the constant DECRYPTY. So you can index.

I am afraid that I have to disagree with you (and Olaf). Your code is an implementation of what we call a monoalphabetic substitution cipher. By today's standards a very weak method of encryption (since your method uses a mixed alphabet it is somewhat stronger than a classic Caesar cipher, but weaker than a polyalphabetic cipher such as the Vigenere cipher), but a method of encryption (and decryption) nevertheless.

RE: garbling a name

It only would be a real Caesar cipher (and it is despite the longer alphabet), if you wouldn't provide the constants but the shift (rotation) offset as the "password". With ENCRYPTY being the scrambled letters, it is already a bit stronger than a normal Caesar cipher, but it can only be considered encryption, if the algorithm isn't known at all; it has to be secret how the encryption is done. That, and also it always resulting in the same output (unless you change ENCRYPTY), makes it act as an encoding rather than as an encryption.

So let's say I apply other (stricter) criteria. I also don't consider the Caesar cipher a "cipher" or encryption anymore. You just need 26 tries to break it, if you know the method. That makes it less deterministic and less of a 1:1 mapping than a normal encoding is, but it can't be considered safe. Every O becomes an H, if you don't change ENCRYPTY, and that's the nature of an encoding.

Jockey's code could be taken in the direction of an even rather strong encryption if you do exactly that and let the ENCRYPTY value change perhaps after each single character translation, perhaps controlled by a password as a second input besides the text to be encrypted. Perhaps with some random noise added. If the mapping is made movable like that, there is no 1:1 mapping of original/encoded characters. It could be done in a deterministic way that is also reproducible for decrypting.

Bye, Olaf.

RE: garbling a name

Hi Strongm,

sorry but your quote I am afraid that I have to disagree with you unquote seems to me you are missing the essence.
The code I have shown is NOT an encryption, as stated several times. You may call it a "monoalphabetic substitution cipher", which is fine with me; however, it is a simple transformer, meaning it transforms a letter into another letter. Thus Olaf will become HWÏV, which is sometimes enough for peeking eyes not to see at a glance that HWÏV is actually Olaf. And since the transformation is done consistently you can also, as required, index meaningfully.

Please read the requirements Gendev made in his initial request, nowhere he is asking for an encryption procedure, he wants to 'hide'.

Regards,

Jockey2

RE: garbling a name

> it is already a bit stronger than a normal Caesar cipher

I believe I said that

> it can only be considered encryption, if the algorithm isn't known at all
You may want to share that view with the cryptographic community. They may disagree with you. The algorithm for Rijndael/AES, for example, is well known; it is even published as a standard. Modern cryptography is based on the principle that a cryptosystem should be secure even if everything about the system, except the key, is public knowledge.

Knowing that a message is encrypted with a Caesar cipher, and knowing the Caesar cipher algorithm, doesn't render the message immediately readable, since you still need to know the key (i.e. the shift). Sure, this is pretty easy to break (as we've already agreed, Caesar ciphers are weak), but just because encryption can be broken doesn't mean that it is no longer encryption - it just means that you might not want to use it ...

>You just need 26 tries to break
25 (the 26th is the plaintext already). Or, more generically: n - 1, where n is the size of the alphabet

>let the ENCRYPTY value change perhaps after each single character translation
Which would mean you now had a running key polyalphabetic cipher, so stronger than Vigenere - and if the running key were truly random, and at least as long as the plain text, and you only ever used it once, then you'd have one-time pad ...
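
To illustrate the point about the shift being the key and n - 1 tries being enough, here is a minimal sketch. CaesarShift is a hypothetical helper (not from this thread) and it assumes plain upper-case A-Z input; everything else passes through unchanged:

```foxpro
LOCAL lcCipher, lnTry
lcCipher = CaesarShift("OLAF", 3)   && the key is the shift, here 3
? lcCipher                          && RODI

* Brute force: undo every possible shift 1..25 and look for readable text
FOR lnTry = 1 TO 25
   ? lnTry, CaesarShift(m.lcCipher, 26 - m.lnTry)
ENDFOR

FUNCTION CaesarShift(tcText, tnShift)
   * Rotate the alphabet by tnShift positions and translate via CHRTRAN()
   LOCAL lcAlphabet, lcShifted, lnShift
   lcAlphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
   lnShift = MOD(m.tnShift, 26)
   lcShifted = SUBSTR(m.lcAlphabet, m.lnShift + 1) + LEFT(m.lcAlphabet, m.lnShift)
   RETURN CHRTRAN(m.tcText, m.lcAlphabet, m.lcShifted)
ENDFUNC
```

Line 3 of the loop output shows OLAF again. Without knowing the shift you need to read through the candidates, which is trivial here but is still a key search.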

RE: garbling a name

>You may want to share that view with the cryptographic community. They may disagree with you.

You misinterpret what I said and twisted my words. I didn't say this is an essential property a good encryption algorithm has to have. I said this to point out it is a bad property of Jockey's code and makes it disqualify as encryption. This all was just to argue against your categorization as such.

Yes, this is a bad property of Jockey's cipher. And that's why your argument about it being an encryption fails. Also, Jockey never intended to write an encryption. Since you are as knowledgeable about encryption as I am, and perhaps even more so, this should have helped you see that your arguing for categorizing Jockey's algorithm as encryption is wrong. The Caesar cipher in itself is also not an encryption by today's criteria, as indeed it should be possible to publish an algorithm and use it as is without compromising encrypted data, i.e. the knowledge of the algorithm does not make encrypted data decryptable. If you see the ENCRYPTY value as a key, you might consider it going in that direction, but as that key needs to be composed of all characters, only the way these characters are permuted is the real key. It's still a vast range of possible keys, but providing it with the code itself makes it no key.

I would have to verify, but maybe it is even more generally true for any cipher or chiffre to be no encryption, but something in between encryption and any usual canonically straight forward encoding. With that I want to say the usual intention of an encoding is surely not to obscure data, but to map a more or less large character set to some byte codes.

Bye, Olaf.

RE: garbling a name

Strongm,

Did you not read what the explanation / remark at the top of the procedure says, 1st sentence?
Please do not compare / judge this procedure in any way with encryption. It has nothing at all to do with encryption.

Jockey2

RE: garbling a name

>The code I have shown is NOT an encryption, as stated several times

Stated by you, yes. But so what? Forgive me, saying so does not make it so. You have simply reinvented a classical substitution cipher, eg http://practicalcryptography.com/ciphers/simple-su....

> Also Jockey never intended to write an encryption
Doesn't matter what Jockey intended. See my comment and link above. Sure, the inclusion of the key makes it child's play to break, but it is encryption nonetheless.

And let's go a step further: encryption is simply some process (i.e. an algorithm) to make information hidden or secret. And to make that process useful, you need some code (or key) to make information accessible. It takes no account of how easy it is to determine the key or reverse the process. The Caesar cipher is still encryption (the key being how many characters we shift), even if it is very, very easy to find that key.

The most simple definition of encryption, though, is that it is the process which converts information or data into a code. And perhaps this is the source of the confusion - encryption is indeed a form of encoding. But not all encoding is encryption. The examples used/discussed in this thread, however, are all encryption


>If you see the ENCRYPTY value as a key
It is a key (a slightly broken key, because the alphabet it is derived from is broken - try encrypting and then decrypting "ääää")

> providing it with the code itself that makes it no key
No, it simply makes it easy to retrieve the plaintext - but it is still a key. Note that if I foolishly publish the key I use to encrypt with AES in ECB mode, then it is easy to recover the plaintext (AES is symmetric) - but that doesn't stop the key from being a key, nor does it disqualify AES ECB as encryption

>it can only be considered encryption, if the algorithm isn't known at all
>it is a bad property of Jockey's code
The issue isn't the algorithm, the issue is the inclusion of the key that is in use. Can't argue with that. But, as I said above, if you expose the key then even AES is equally easily compromised - but that doesn't make AES a bad algorithm (or set of algorithms), nor stop it from being considered encryption.

>You misinterpret what I said
Yep, looks like I may have done, so apologies for that. However, I still don't agree with your conclusions even with the misunderstanding cleared up (see points above)

>Caesar cipher in itself also is not an encryption applying today's criteria
Yes, it is. Encryption has a definition in cryptography, and the Caesar cipher meets that definition. It just isn't very secure anymore. You wouldn't go around trying to say that a medieval shield was no longer a shield just because it isn't very useful on a modern battlefield.

RE: garbling a name

Strongman,

I noticed your above reply, but sorry I stopped reading after the first ?.
It seems that you are seeking to be right although you know you are not.
I have stated the code is not an encryption.
If you are willing to pull all kinds of excuses and proofs that my coding is not encryption then you are correct, although this is useless as I already told you so.
I did not reinvent any new wheel I just showed the OP how to do this classical substitution in VFP. The code is so basic it is not even worth to discuss, it works and that what counts.
So please stop now being the wise guy and stop to tell us my coding is not an encryption as it is not.

Regards,

Jockey2

RE: garbling a name

Well, OK, never mind strongm, interesting thoughts, but by means of all the definition of encryption, a key is part of it, when major encryption categories are differentiating between symmetric (one secret key) and asymmetric (public/private key pair) algorithms and no other category exists.

We could agree on it being a symmetric cryptographic encryption with its key given within the code, which makes it an open secret. There's no such thing as a public key in a symmetric algorithm. We can stop to quarrel about definitions here, I'll simply agree with you. The way it's provided it's still a lock with the key stuck in it.

You'll also have to agree, that Jockey is right in his comment about the warning this is not good to use for encryption, even if it is an encryption and thereby give Jockey the grace of still making a correct classification of the usability limits of this. It is quite nice for garbling names in a recoverable manner, results in something not really readable, but still at least printable and not cluttered with control codes or anything else coming from a normal binary encryption. And it's reversible, which is or is not intended, we'll most probably never know.

If you make it so the replacement characters are all having a higher byte code than the original characters, it could even be used to keep the sort order of garbled names. For that to work with a range of letters, you would rather end up with Cesar cipher. Including several printable characters after the letters, you could just leave some gaps, if you reduce the original character set to make it a bit more complicated than just the shifted up alphabet or alphabets (if you make a distinction of small and capital letters). That way it could end up looking quite like base64.
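
A minimal sketch of such an order-preserving mapping, under loud assumptions: the constants PLAIN_AZ and ORDERED_AZ and the function OrderedGarble are made up for illustration, names are folded to upper case first, only A-Z is mapped, and comparisons use machine (byte value) collation:

```foxpro
* 26 replacement characters picked in strictly ascending byte order, with gaps
#DEFINE PLAIN_AZ   "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
#DEFINE ORDERED_AZ "02589BDEHJKNPSTWYadfhkmqtx"

SET COLLATE TO "MACHINE"    && compare strings by byte value

? OrderedGarble("Adams")                             && 080Pd
? OrderedGarble("Adams") < OrderedGarble("Baker")    && .T.

FUNCTION OrderedGarble(tcName)
   * Because the mapping is monotone, A < B in the original implies
   * mapped(A) < mapped(B) in the garbled text
   RETURN CHRTRAN(UPPER(m.tcName), PLAIN_AZ, ORDERED_AZ)
ENDFUNC
```

Since every replacement sorts in the same relative order as its original letter, an index on the garbled value lists the records in the same order as the real names would. The price is exactly what is discussed above: the mapping is fixed, one-to-one, and easy to reverse.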

Bye, Olaf.




RE: garbling a name

> I stopped reading after the first ?.

A pity, since that presumably means you couldn't be bothered to follow the link to an independent 3rd party cryptography expert (one of many, many) that makes it clear that a substitution cipher, using your exact method, is encryption. Here's another one, from the Computer Science department of the University of Rhode Island. If you looked at them, you wouldn't have to take just my word for it, which is lucky because you've clearly taken umbrage with me at a minor correction in terminology. Not quite sure why, given that at no time have I suggested that your code is an inappropriate solution for the OPs stated requirements.

>it works and that what counts
Again, never said otherwise (although I grant that I did point out that there is a flaw in your ENCRYPT and DECRYPT strings which breaks the symmetrical encryption/decryption under certain circumstances - but not with the algorithm itself)

>excuses
Excuses? What excuses? Are we suffering from a language barrier here?

>although you know you are not.
Oh dear. No, quite the contrary. I do some of this for a living. But I know you have decided you don't like anything I say, which is why I have provided independent links that confirm the argument I am making.

>I have stated the code is not an encryption
So, if I state that grass is red, that makes it red, does it? The point is that the facts don't support your statement. Let me ask a question - if I used your algorithm to 'scramble' some plain text - but using a different ENCRYPT (something Olaf alluded to earlier) - and gave you that 'scrambled' text, would you easily be able to 'unscramble' to the original plain text? Or would you somehow need to gain access to or figure out what ENCRYPT was? Mind you, you probably have not read this far ...

laËÊRôçqG

RE: garbling a name

Strongm,

Thanks, good catch!

find below the corrected strings:
#Define DECRYPTY "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890äáàâÄÁÀÂëéèêËÉÈÊïíìîÏÍÌÎöóòôÖÓÒÔüúùûüÜÚÙÛÇç@.-&#"
#Define ENCRYPTY "Ï3ÍTUVÂë56éWXYêËQÛÇBÉZaÚ4ÙbÌÎöfghijkó7pq8HIâÄÔáAsCÁ90òMwxyNOPÒRSîúùnorstûÜç@.-&#GÀèJKLôÖz12äDEFcdelmuvàÈÊïíìÓü"

Jockey2

RE: garbling a name

Olaf, pretty much agree with all your points in your last post.

Jockey2, I think you may need to change DECRYPT as well, to "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890äáàâÄÁÀÂëéèêËÉÈÊïíìîÏÍÌÎöóòôÖÓÒÔüúùûÜÚÙÛÇç@.-&#"

RE: garbling a name

Strongm,

I have edited the constants in my message of 22.8 20:15
Thanks for the remark about the incorrect constant.
I suppose (hope) it now works as expected.

Regards,

Jockey2

RE: garbling a name

(OP)
Thanks to all and especially jockey2 who got me on track with a scrambling routine that satisfies my client. He created new ENCRYPT and DECRYPT strings to do the job in his particular way.
I am now trying to 'functionise' my long-winded prg which loops through 29 tables in a project.
I am unable to pass through a fieldname as a parameter to get the work done in an efficient prg.
My function works within a Do while !(eof) loop and is called by
thefield = 'firstname'
Do fielding With thefield

CODE

Function fielding
Parameters cmyfield

Go Top
Strtofile(cmyfield+Chr(013)+Chr(10),('garble.log'),.T.)
Do While !Eof()

	fcontent =  Alltrim(cmyfield) && original field contents 
	fncontent = scrambling (fcontent )&& scrambled field contents 

	cMessageText = Str(Recno())+'  ' +fcontent+'  '
	Strtofile(cMessageText+Chr(013)+Chr(10),('garble.log'),.T.)
	cMessageText = Str(Recno())+'  ' +fncontent
	Strtofile(cMessageText+Chr(013)+Chr(10),('garble.log'),.T.)
	If ldoit
		Replace cmyfield With fncontent
	Endif

	Skip

Enddo 

How do I retain the fieldname in cmyfield so that I can get the contents of each field rather than the fieldname itself?
Thanks
GenDev

RE: garbling a name

(OP)
I have discovered the & and now it works as I wanted.
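
For reference, a minimal sketch of two alternatives to the & macro operator for this case: EVALUATE() to read a field whose name is held in a variable, and a name expression in parentheses for REPLACE. The procedure name ScrambleField and the demo cursor are assumptions for illustration; scrambling() is assumed to be Jockey2's function from earlier in the thread, available in the same PRG or via SET PROCEDURE:

```foxpro
* Demo data: a throwaway cursor standing in for one of the project tables
CREATE CURSOR csrDemo (firstname C(20), lastname C(20))
INSERT INTO csrDemo VALUES ("Olaf", "Doe")
INSERT INTO csrDemo VALUES ("Mike", "Roe")

SELECT csrDemo
DO ScrambleField WITH "firstname"

PROCEDURE ScrambleField(tcField)
   * tcField holds the NAME of a field in the currently selected work area.
   * Assumes no index order is set on that field while scanning.
   LOCAL lcOld
   SCAN
      lcOld = ALLTRIM(EVALUATE(m.tcField))               && the field's contents, not its name
      REPLACE (m.tcField) WITH scrambling(m.lcOld, .T.)  && name expression instead of &
   ENDSCAN
ENDPROC
```

SCAN ... ENDSCAN replaces the Go Top / Do While !Eof() / Skip pattern and moves the record pointer by itself.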

Gendev

RE: garbling a name

You now introduce two ways to decode the name again, once by using the scrambling with scrambling(fcontent,.F.) and once because you write original and scrambled values into a log.

If that's really the need, I go back to my suggestion of using a real strong encryption function instead of anything self-written:

Quote (myself)

If ... you want to encrypt and decrypt names, not just "garble" them, then better make use of crypto API or - what's simpler to use - vfpencryption.fll

https://www.sweetpotatosoftware.com/blog/index.php...

Especially, if it's about HIPAA compliance it's not just about "garbling" names somehow, but algorithms to be used are clearly specified.

I have to admit the last sentence was not checked well.

http://www.hipaajournal.com/hipaa-encryption-requi...

Quote (HIPAA Journal)

One of the reasons why the HIPAA encryption requirements are vague and open to interpretation is that, when the original Security Rule was enacted, it was acknowledged that technology advances. What may be considered appropriate encryption standards one day, may be inappropriate another.

Similar thoughts are valid for other domains, not only patient data. Due to the advances in such algorithms it's a valid thought to not get too specific. But that surely doesn't suggest "rolling your own". I recently had the opposite, very strictly defined specs for the cash register security regulation of Austria, specifying every single step to take to create signatures of receipts and QR codes of those signatures.

Bye, Olaf.

RE: garbling a name

Don't use & with file names. Instead use parens. If you now have code like:

CODE

USE &cMyTable 

change it to:

CODE

USE (cMyTable) 

The & version will fail if cMyTable contains a path with a space in it.

Tamar

RE: garbling a name

Hi Gendev,

My scrambling function has two parameters:
Parameters tcIn, tlScramble
tcIn = the word to be scrambled / unscrambled
tlScramble = .T. to scramble , .F. to unscramble

to scramble all the fields of yourtable.field1:

CODE --> vfp

select yourtable
scan
replace field1 with scrambling(yourtable.field1,.t.)
endscan 

As Tamar correctly pointed out: avoid using the & (macro substitution).
The advice here by Olaf gives you an encrypted value of your field, which is not the same as a scrambled value; you will not be able to index an encrypted value logically, as required.
A scrambled value is not at all an encrypted value and is not to be used as encryption; it is just there so that 'peeking' eyes cannot read the content without the aid of a tool.

Regards,

Koen

RE: garbling a name

Quote (Koen Piller)

tlScramble = .T. to scramble , .F. to unscramble
You can also do it inversely, though, as the algorithm is symmetric.

CODE

lcScrambled = scrambling("Olaf",.F.) && HWÏV
? lcScrambled
? scrambling(lcScrambled,.T.) && Olaf 

By the way, your constants are still broken: your latest version of DECRYPTY has "ü" twice, and I just stumbled on this using the "Olaf" example. Every character must be unique in both strings, or you get a non-reversible mapping.

Bye, Olaf.

RE: garbling a name

And strongm,

in regard to your example of "laËÊRôçqG" being non-decryptable with the original or changed constants, you don't prove much. If I would change the mapping done on the base64 encoding and choose any 64 other characters in a scrambled order and gave you a single example of my renewed encoding, you also couldn't decode it. base64 encoding would still just be an encoding. And even a weak encryption can pose a quite unsolvable problem for a short ciphertext; you can't attack it statistically, for example. If I told you I Caesar ciphered a single letter to "k", you also wouldn't know what letter I encoded, if I never tell you when you guessed right.

Besides, I already said this can be seen as "a symmetric cryptographic encryption with its key given within the code". You have not revealed your constant change, so you have kept your key secret. If gendev wanted to do the same, he would still need to provide the code to end users and couldn't use a constant. ENCRYPTY especially would need to be stored somewhere encrypted itself to make this encryption safer, but that would lead to the ironic situation of the key being more strongly encrypted than the data.

In essence anyway: Any code mapping set1->set2 and inversely set2->set1 is not an encryption but simply an encoding. All characters are always encoded the same. Unless you broke this 1:1 mapping property in your choice of constant values, I can say one thing about "laËÊRôçqG": all original characters differ from each other. If you don't cheat and would encrypt "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" for me, the result would give me the decoding constant I need for letters. Anything I can enter and have encrypted gives me the inverse translation of anything written with the same letters, so that's really weak. Say I am a user and enter a record with a name "a bc def ghij" and later get at the encrypted data: I could spot its encryption merely by the pattern "* ** *** ****". Do that several times and I have all the necessary info to decode.
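The weakness described above can be illustrated with a small sketch. It is written in Python rather than VFP, and the alphabet, the seed and the function names are assumptions made up for the illustration; they are not Koen's constants or his scrambling() function. Submitting the alphabet once yields the complete inverse table:

```python
import random
import string

# Hypothetical plain alphabet (letters only), standing in for a DECRYPTY-style constant.
ALPHABET = string.ascii_letters


def make_key(seed):
    """Return a shuffled copy of ALPHABET, standing in for a secret ENCRYPTY-style constant."""
    chars = list(ALPHABET)
    random.Random(seed).shuffle(chars)
    return "".join(chars)


def scramble(text, key):
    """Substitute each letter 1:1; characters outside ALPHABET (e.g. spaces) pass through."""
    return text.translate(str.maketrans(ALPHABET, key))


secret_key = make_key(12345)  # unknown to the "user"

# The user enters the alphabet as a name and later sees its scrambled form.
leaked = scramble(ALPHABET, secret_key)
inverse = str.maketrans(leaked, ALPHABET)

# Any other scrambled value can now be read, and word lengths survive scrambling.
ciphertext = scramble("a bc def ghij", secret_key)
recovered = ciphertext.translate(inverse)
print([len(word) for word in ciphertext.split(" ")])  # [1, 2, 3, 4]
print(recovered)  # a bc def ghij
```

The scrambled alphabet is the key itself, which is why a single chosen input is enough.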

So gendev, I stay with my advice to not use that: if your customer really needs strong encryption, this isn't it.

If you have any trouble with VFP encryption, let me know. I have stumbled upon errors it throws when it is used wrongly. For example, using it with AES needs a 32-byte key; a key of any other length needs to be padded to that. The same goes for other parameters. But the FLL works when used correctly, and the descriptions of the ENCRYPT and DECRYPT functions explain the necessary sizes.
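As an alternative to padding a password by hand, a fixed-length key can be derived from it. The following is only a sketch in Python, not the API of vfpencryption.fll; the function name derive_key, the salt and the iteration count are assumptions chosen for the example:

```python
import hashlib


def derive_key(password, salt, length=32):
    """Derive a key of exactly `length` bytes from a password using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256",                  # underlying hash function
        password.encode("utf-8"),  # password as bytes
        salt,                      # salt; would be stored alongside the data
        100_000,                   # iteration count (hypothetical value)
        dklen=length,              # requested key length in bytes
    )


key = derive_key("my passphrase", b"example-salt-16b")
print(len(key))  # 32
```

The same password and salt always give the same 32 bytes, so the key itself never has to be stored.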

RE: garbling a name

>If I told you I Caesar ciphered a single letter to "k", you also wouldn't know what letter I encoded

Absolutely, since under those specific circumstances, what you have is a one-time pad ... which is pretty much acknowledged to be an unbreakable encryption.
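A quick way to see this, sketched in Python (the helper name caesar_decrypt is made up for the example): for the single ciphertext letter "k", every one of the 26 possible shifts gives a different and equally plausible plaintext letter.

```python
import string

ALPHABET = string.ascii_lowercase


def caesar_decrypt(letter, shift):
    """Undo a Caesar shift of `shift` positions for one lowercase letter."""
    return ALPHABET[(ALPHABET.index(letter) - shift) % 26]


# One candidate plaintext per possible key (shift 0..25).
candidates = [caesar_decrypt("k", shift) for shift in range(26)]
print("".join(candidates))
```

The candidates cover the whole alphabet exactly once, so the ciphertext alone tells an attacker nothing about the original letter.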

>If I would change the mapping done on the base64 encoding and choose any 64 other characters in a scrambled order

Then it wouldn't be Base64, would it? ;-)

>you don't prove much.

The point was to demonstrate - as you yourself had said at least twice in this thread - that the key was critical. Yes, including the key means that the encryption is only being used to carry out encoding (again, as I thought we'd already agreed), but that does not change the fact that the algorithm is one of encryption.

If we were to accept your argument as being valid, then AES, already mentioned several times in this thread, suffers the same issue. If I know the algorithm (heck it is in the public domain, so I do) and I know the key being used, then for me all AES is effectively doing is simple encoding, since I can very easily move backwards and forwards between plaintext and ciphertext (using that known key the same chunk of plaintext will always deliver the exact same chunk of ciphertext and vice versa, i.e. your set 1 > set 2 stuff, although a block mapping rather than a character/byte mapping). But that really, really doesn't mean that the AES is suddenly not carrying out encryption.

Perhaps what we can say is that, at root, encryption is encoding, but that not all encoding is encryption.

>If you don't cheat and would encrypt "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" for me, the result would give me the decoding constant I need for letters

Here you go: LZúûtcxv8V09MÔÚ7jel@îÎÓùAèÊôaGqgFJ6dÉçRÄ5u-ËXâÂéàrUn
So now you can carry out a classic bit of cryptanalysis, using a known-plaintext attack ... :-)

RE: garbling a name

>Then it wouldn't be Base64, would it?
It would still be doing the same, just mapping to another set of 64 characters. It would still deserve the same name. It wouldn't be compatible. But it still just would be an encoding.

>then AES, already mentioned several times in this thread, suffers the same issue.

No, it doesn't convert the same letters to the same encrypted letter. It works on whole blocks of bytes (16 bytes per block in standard AES), and each block is garbled in itself, which makes it impossible to build up a decrypting mapping. That's what makes it a different category of algorithm, not just the password. So AES doesn't have that issue.

And in regard to the known-plaintext attack: An "encryption" not standing that attack, can't call itself encryption. That's the whole point. If I know a plaintext, the algorithm and the encoding result, I should still not be able to deduce the key from that, which here I can. You gave me at least the part of the ENCRYPTY constant needed to decrypt all letters; it worked and resulted in "Scrambled". AES is not flawed in that way at all.
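The decoding step can be reproduced with the plaintext/ciphertext pair posted earlier in the thread. This is a sketch in Python rather than VFP; the variable and function names are made up, while the two strings and the challenge text are taken from the posts above:

```python
# Known plaintext and the scrambled result posted for it.
KNOWN_PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
KNOWN_CIPHER = "LZúûtcxv8V09MÔÚ7jel@îÎÓùAèÊôaGqgFJ6dÉçRÄ5u-ËXâÂéàrUn"

# Because the mapping is 1:1 per character, the pair directly gives the inverse table.
decode_table = str.maketrans(KNOWN_CIPHER, KNOWN_PLAIN)


def decode(text):
    """Translate scrambled letters back using the table built from the known pair."""
    return text.translate(decode_table)


print(decode("laËÊRôçqG"))  # Scrambled
```

In VFP the same idea should come down to a single CHRTRAN() call with the two strings as search and replacement characters.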

Bye, Olaf.

RE: garbling a name

Olaf,

you are so correct!
there is a double ü in constant DECRYPTY!!
Once you start changing something, you should take care not to create another mistake.
I will review both constants and, after double-checking, publish the revised versions.
Sorry for the confusion caused.

Regards,
Koen

RE: garbling a name

>It would still deserve the same name

No, it wouldn't, you know. Base64 is a standard, clearly defined in RFC 4648. If you use a different alphabet map than the one in the standard, it is not base64. The standard itself is explicit about this when discussing an alternative encoding: "This encoding should not be regarded as the same as the "base64" encoding and should not be referred to as only "base64"."

(and if you do the research you'll also find that using the algorithm used for base64, but using alternative, secret maps is indeed considered encryption ...)

And perhaps this is where we differ. You appear to prefer to define things to your own requirements - "An 'encryption' not standing that attack, can't call itself encryption", which is pure nonsense. It may end up not being a very good or secure encryption, but it remains encryption nevertheless. Known-plaintext attacks (colloquially known as cribs) were used to break Enigma, for example, during WW2; are you suggesting that Enigma was not encryption?

>AES is not flawed in that way at all
Very true. AES is not susceptible to a known-plaintext attack. I never suggested it was. That wasn't the point I was making. My point was that if you know the key then, given your assertion that "[a]ny code mapping set1->set2 and inversely set2->set1 is not an encryption but simply an encoding", even AES is simply an encoding (a complex encoding, but an encoding nevertheless). This is, of course, inevitable: for a given key, every plaintext must map to a unique ciphertext, and vice versa (the whole thing would be pointless if it didn't), and in that sense it is encoding. But just because it is encoding does not mean that it is not encryption. And hence my statement that encryption is encoding, but that not all encoding is encryption.

RE: garbling a name

We're going in circles here. Whether you see a mapping as an encryption or not, this thing is only good for hiding names from the eyes of readers and puts up only a small hurdle against unmangling the "encrypted" names or, more generally, data.

I think I have now stressed often enough that the use of strong encryption is demanded by laws. So even if a customer is satisfied with something weaker, you could save yourself from legal trouble coming back at you maybe years later, gendev. OWASP is very strong on this: https://www.owasp.org/index.php/Cryptographic_Stor...

Quote (OWASP)

Do not implement an existing cryptographic algorithm on your own, no matter how easy it appears.
This does not even address implementing your own algorithm, just your own implementation of a known algorithm.

For the sake of having data in CSV or XML in printable characters only, you can do as XML does with binary data and embed it base64-encoded, but there is no need to write a "scrambling" that only uses printable characters.

The original idea to simply generate random letter combinations, not at all influenced by the original name, is already much safer, as it doesn't allow any retranslation other than by using the garble.log. If that's a specification, you have your way back to the original names by access to this file only. If it really should be used for getting back the original names at a later stage, I'd put it in a DBF rather than a text log file, with the three fields recno, name, randomstr. You would only need to store randomstr if you want to ensure this is the data you got from this run; otherwise, of course, knowing recno and the original name is enough. On the other hand, you could simply keep a copy of the original data to revert to; such a log is quite illogical, as it's just a verbose description of the original data.
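The random-letter approach with a separate lookup table could look roughly like this. It is a minimal sketch in Python, not VFP; the function names and the sample names are made up, and the list of tuples stands in for a DBF with the fields recno, name, randomstr:

```python
import random
import string


def garble(name, rng):
    """Replace every letter by a random A-Z letter; keep spaces, punctuation and digits."""
    return "".join(
        rng.choice(string.ascii_uppercase) if ch.isalpha() else ch
        for ch in name
    )


def garble_all(names, rng=None):
    """Return the garbled names plus a lookup table of (recno, name, randomstr) rows."""
    rng = rng or random.SystemRandom()  # not derived from the names in any way
    garbled, lookup = [], []
    for recno, name in enumerate(names, start=1):
        randomstr = garble(name, rng)
        garbled.append(randomstr)
        lookup.append((recno, name, randomstr))
    return garbled, lookup


names = ["Anna Smith", "O'Neil, J."]
garbled, lookup = garble_all(names)

# Without the lookup table there is no way back; with it, recno restores the name.
restored = [name for recno, name, randomstr in lookup]
print(restored == names)  # True
```

Whoever holds the lookup table holds the original names, so it needs the same protection as the original data.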

Bye, Olaf.

RE: garbling a name


Hi,

According to my tests, here are constants where all letters are unique.
Sorry for the trouble caused.

#Define DECRYPTY "abcdefghijklmnopqrstuvwxyzABSCDEFGHIJKLMNOPQRSTUVWXYZ1234567890äáàâÄÁÀÂëéèêËÉÈÊïíìîÏÍÌÎöóòôÖÓÒÔüúùûÜÚÙÛÇç@.-&#"
#Define ENCRYPTY "Ï3ÍTUVÂë56éWXYêËQÛÇBÉZaÚ4ÙbÌÎöfghijkó7pq8HIâÄÔáAsCÁ90òMwxyNOPÒRSîúùnorstûÜç@.-&#GÀèJKLôÖz12äDEFcdelmuvàÈÊïíìÓü"

Regards,

Koen

RE: garbling a name

You still have errors in there, Koen; check the small and capital S in both strings.

#Define DECRYPTY "abcdefghijklmnopqrstuvwxyzABSCDEFGHIJKLMNOPQRSTUVWXYZ1234567890äáàâÄÁÀÂëéèêËÉÈÊïíìîÏÍÌÎöóòôÖÓÒÔüúùûÜÚÙÛÇç@.-&#"
#Define ENCRYPTY "Ï3ÍTUVÂë56éWXYêËQÛÇBÉZaÚ4ÙbÌÎöfghijkó7pq8HIâÄÔáAsCÁ90òMwxyNOPÒRSîúùnorstûÜç@.-&#GÀèJKLôÖz12äDEFcdelmuvàÈÊïíìÓü"

It's not hard to construct constants that will work: for DECRYPTY, cover all CHR(i) for which ISALPHA() returns .T., add digits, punctuation and any other printable characters you want, then simply randomly shuffle that string.

Here you have a simple check routine:

CODE

#Define DECRYPTY "abcdefghijklmnopqrstuvwxyzABSCDEFGHIJKLMNOPQRSTUVWXYZ1234567890äáàâÄÁÀÂëéèêËÉÈÊïíìîÏÍÌÎöóòôÖÓÒÔüúùûÜÚÙÛÇç@.-&#"
#Define ENCRYPTY "Ï3ÍTUVÂë56éWXYêËQÛÇBÉZaÚ4ÙbÌÎöfghijkó7pq8HIâÄÔáAsCÁ90òMwxyNOPÒRSîúùnorstûÜç@.-&#GÀèJKLôÖz12äDEFcdelmuvàÈÊïíìÓü"

lcLine1 = DECRYPTY
lcLine2 = ENCRYPTY

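* Every character of one constant must occur exactly once in the other
IF CharsOnceInString(lcLine1, lcLine2) AND CharsOnceInString(lcLine2, lcLine1)
   ? "OK"
ENDIF

FUNCTION CharsOnceInString() as Boolean
   LPARAMETERS tcChars, tcString

   * Strip leading characters from tcChars as long as each occurs exactly once in tcString
   DO WHILE LEN(tcChars)>0 AND LEN(CHRTRAN(tcString,LEFT(tcChars,1),"")) = LEN(tcString)-1
      tcChars = SUBSTR(tcChars,2)
   ENDDO

   * Anything left over starts with the first character that failed the test
   IF !EMPTY(tcChars)
      ? "problematic character "+LEFT(tcChars,1)+" occurs "+TRANSFORM(OCCURS(LEFT(tcChars,1), tcString))+" times in tcString"
   ENDIF

   RETURN (LEN(tcChars)=0)
ENDFUNC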


Bye, Olaf.

Edit: Once you remove the uppercase S from ABSCDEF in DECRYPTY and one of the small s in ENCRYPTY (doesn't matter which), the two constants are OK.
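The duplicates can also be listed directly. This is a sketch in Python rather than VFP, with a made-up helper name, run against the two constants exactly as posted above:

```python
from collections import Counter

DECRYPTY = "abcdefghijklmnopqrstuvwxyzABSCDEFGHIJKLMNOPQRSTUVWXYZ1234567890äáàâÄÁÀÂëéèêËÉÈÊïíìîÏÍÌÎöóòôÖÓÒÔüúùûÜÚÙÛÇç@.-&#"
ENCRYPTY = "Ï3ÍTUVÂë56éWXYêËQÛÇBÉZaÚ4ÙbÌÎöfghijkó7pq8HIâÄÔáAsCÁ90òMwxyNOPÒRSîúùnorstûÜç@.-&#GÀèJKLôÖz12äDEFcdelmuvàÈÊïíìÓü"


def duplicates(text):
    """Return the characters that occur more than once, in order of first appearance."""
    return [ch for ch, count in Counter(text).items() if count > 1]


print(duplicates(DECRYPTY))  # ['S']
print(duplicates(ENCRYPTY))  # ['s']
```

An empty list for both constants, together with equal lengths, is the precondition for a reversible mapping.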

RE: garbling a name

Olaf,

Thanks for showing the bug and thanks for the code to compose.

I changed the constants and performed the following check, which displays two identical lines:

CODE

CLEAR
? "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890äáàâÄÁÀÂëéèêËÉÈÊïíìîÏÍÌÎöóòôÖÓÒÔüúùûÜÚÙÛÇç@.-&#"
lc = scrambling("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890äáàâÄÁÀÂëéèêËÉÈÊïíìîÏÍÌÎöóòôÖÓÒÔüúùûÜÚÙÛÇç@.-&#",.T.)
? scrambling(m.lc,.F.)


Koen

RE: garbling a name

>simply randomly shuffle that string.

Shuffling DECRYPTY is certainly how I generated the ENCRYPTY that I used for my laËÊRôçqG 'challenge'. In fact, I dispensed with a fixed ENCRYPTY; it was generated on the fly using a password to seed the shuffle routine.

RE: garbling a name

>using a password to seed the shuffle routine

I played with this and had the best results using SHA512 to first create a more random seed value from a password. Of course, that still leaves weak passwords weak, and depending on how you use the seed value for shuffling, you can get very weak shuffles. I first tried to use bytes as "cut" positions to split the ENCRYPTY value and join the split parts swapped. Doing it this way without first hashing the passwords, similar passwords also decrypt the encrypted text(!)
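A password-seeded shuffle of that kind might be sketched as follows. This is Python, not VFP, and not the routine either of us used; the alphabet and the function name are assumptions, and Python's random.Random is not a cryptographic generator:

```python
import hashlib
import random
import string

# Hypothetical plain alphabet standing in for DECRYPTY.
PLAIN = string.ascii_letters + string.digits


def key_from_password(password):
    """Shuffle PLAIN with a generator seeded from the SHA-512 hash of the password."""
    digest = hashlib.sha512(password.encode("utf-8")).digest()
    seed = int.from_bytes(digest, "big")  # similar passwords give unrelated seeds
    chars = list(PLAIN)
    random.Random(seed).shuffle(chars)
    return "".join(chars)


key = key_from_password("secret")
print(key == key_from_password("secret"))  # True: same password, same shuffle
print(sorted(key) == sorted(PLAIN))        # True: the key is a permutation of PLAIN
```

Hashing first only spreads similar passwords apart. The result is still a fixed 1:1 character mapping and stays open to the known-plaintext attack discussed above.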

Also, shuffling only once initially still leaves the whole process open to unmasking with known plaintext.

The nature of what gendev does with automatic "garbling" and "ungarbling" requires a fixed password or key, and that makes it a long-term target. If you do it as is done with password hashes, storing a salt or key per record, you give away the necessary data to "ungarble", too.

It's simply a pity gendev still keeps it at using this routine.

Bye, Olaf.
