Regex of more than 1 character not recognised

(OP)
Hello,
I am trying to filter a text file of browsing history generated using ChromeHistoryView. A sample of the format:

==================================================
URL : https://mountainlaureldesigns.com/
Title : Ultra Light Tents, Tarps, Bivys, Packs & Gear | Mountain Laurel Designs | Super Ultra Light Equipment for Outdoor & Wilderness
Visited On : 9/6/2022 1:29:39 AM
Visit Count : 1
Typed Count : 0
Referrer : https://www.google.com/search?q=mountain+laurel+de...
Visit Duration :
Visit ID : 62
Profile : Default
URL Length : 34
Transition Type : Link
Transition Qualifiers: Chain Start,Chain End
History File : C:\Users\subla\AppData\Local\Google\Chrome\User Data\Default\History
==================================================

For some reason, no regex longer than one character matches anything in this file. For example, /U/ matches without a problem, but /URL/ matches nothing at all. This seems to be specific to this file: when I cut and paste a few entries into another file, regexes behave as expected. Is there something about the text format here that would cause this? It's very baffling. Thanks.

UPDATE: OK, this appears to be related to the text encoding. The problem occurs when the file is encoded as UTF-16, while UTF-8 behaves as expected. Does anyone know why this is? Cheers.


RE: Regex of more than 1 character not recognised

On which OS are you running awk (Linux, Windows, Mac)?

RE: Regex of more than 1 character not recognised

By the way, you could post the relevant part of the awk script you have written so far.
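
On the UTF-16 question in the update: a likely explanation, assuming the tool reads the file byte by byte and is not UTF-16 aware, is that UTF-16 stores each ASCII character as two bytes, so "URL" sits on disk as U, NUL, R, NUL, L, NUL. A single-character pattern such as /U/ still finds its byte, but /URL/ never sees the three letters next to each other. The sketch below uses Python rather than awk, purely to show the byte layout. The variable names and the helper utf16_to_utf8 are illustrative, and little-endian byte order with a leading BOM is an assumption about the exported file.

```python
import codecs
import re

# Sample line taken from the post above.
line = "URL : https://mountainlaureldesigns.com/\n"

utf8_bytes = line.encode("utf-8")
# Assumption: the export is little-endian UTF-16.
utf16_bytes = line.encode("utf-16-le")

# Byte-oriented patterns stand in for a tool that is not UTF-16 aware.
print(re.search(rb"URL", utf8_bytes) is not None)   # True
print(re.search(rb"URL", utf16_bytes) is not None)  # False
print(re.search(rb"U", utf16_bytes) is not None)    # True
print(utf16_bytes[:6])                              # b'U\x00R\x00L\x00'


def utf16_to_utf8(data: bytes) -> bytes:
    """Decode UTF-16 data that starts with a BOM and re-encode it as UTF-8."""
    return data.decode("utf-16").encode("utf-8")


# Simulate a file that begins with a little-endian byte order mark.
file_bytes = codecs.BOM_UTF16_LE + utf16_bytes
converted = utf16_to_utf8(file_bytes)
print(re.search(rb"URL", converted) is not None)    # True
```

If this is what is happening, converting the file to UTF-8 before running awk should make multi-character regexes match again, for example with `iconv -f UTF-16 -t UTF-8` where iconv is available. This would also explain why pasting entries into another file worked, if the editor saved that new file as UTF-8.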
