HxD: Search with Regex

Wishlists for new functionality and features.
Schtrudel
Posts: 13
Joined: 08 Jan 2006 10:48

Post by Schtrudel » 07 Jun 2006 09:20

Hi Mael!

Here is a feature a colleague asked me about:
He wants to search a binary file using a binary regular expression. For example, searching for a pattern like:
34 4? 57
would return all matches, treating the low nibble of the second byte as "don't care".
If it's possible to get down to the bit level, that would be fantastic (e.g. ignoring 3 bits and not a whole nibble).
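
One way to picture this request is as a list of (value, mask) pairs, one per byte: a "?" nibble clears four mask bits, and a bit-level wildcard simply clears fewer or other bits. The sketch below is only an illustration in Python, not how HxD works; the names parse_pattern and find_all are made up for this example, and the scan is a plain linear one.

```python
def parse_pattern(text):
    """Turn a pattern such as '34 4? 57' into (value, mask) byte pairs."""
    pairs = []
    for token in text.split():
        if len(token) != 2:
            raise ValueError(f"expected two hex digits or '?', got {token!r}")
        value = mask = 0
        for ch in token:
            value <<= 4
            mask <<= 4
            if ch != "?":  # '?' leaves this nibble as "don't care"
                value |= int(ch, 16)
                mask |= 0xF
        pairs.append((value, mask))
    return pairs


def find_all(data, pairs):
    """Return every offset where data matches all (value, mask) pairs."""
    n = len(pairs)
    return [
        i
        for i in range(len(data) - n + 1)
        if all((data[i + j] & m) == v for j, (v, m) in enumerate(pairs))
    ]


data = bytes.fromhex("00 34 41 57 34 4F 57 34 51 57")
print(find_all(data, parse_pattern("34 4? 57")))
# Bit-level variant: ignore only the low 3 bits of the second byte.
print(find_all(data, [(0x34, 0xFF), (0x40, 0xF8), (0x57, 0xFF)]))
```

The second call shows that masks are not limited to whole nibbles: 0xF8 compares the upper five bits and ignores the lower three.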

Thanks

Schtrudel

Maël
Site Admin
Posts: 1129
Joined: 12 Mar 2005 14:15

Post by Maël » 08 Jun 2006 15:14

Hi Schtrudel,
Really flexible regular expressions (with some syntactic sugar) mean that I have to roll my own implementation, so don't expect this feature before version 1.8 or 1.9.

Regards, Maël.

Schtrudel
Posts: 13
Joined: 08 Jan 2006 10:48

Post by Schtrudel » 08 Jun 2006 15:32

Maël wrote:Hi Schtrudel,
Really flexible regular expressions (with some syntactic sugar) mean that I have to roll my own implementation, so don't expect this feature before version 1.8 or 1.9.

Regards, Maël.

Thanks,

I'll wait... :roll:

Schtrudel

b0ne
Posts: 5
Joined: 02 Mar 2007 18:40

Post by b0ne » 02 Mar 2007 19:19

Maël wrote:Really flexible regular expressions (with some syntactic sugar) mean that I have to roll my own implementation, so don't expect this feature before version 1.8 or 1.9.
I've never really known a use for matching 4 bits out of a byte, not that there aren't people out there who might.

May I recommend using the PCRE regex library instead of spending a lot of time implementing your own "regex style" search functionality?

It should be fairly simple to convert a search syntax like this:
81 8F E3 F7 9E . . . . 9F FF 8E 87 6F 47 87
(or maybe double dots such that the spacing remains the same?)

To something like this:
\x81\x8F\xE3\xF7\x9E....\x9F\xFF\x8E\x87\x6F\x47\x87

I use regular expressions to search machine code binaries on a regular basis, but none of those tools is a hex editor, so it would be very nice to integrate that functionality into this particular hex editor. (Fast and lightweight = good i.m.o.)

Other examples of useful syntax that is PCRE-compatible:

Code:

.*
(matches all bytes until next specified match)

.{1,200}
(matching any byte minimum of 1 times, maximum of 200)

[90-F0] converts to [\x90-\xF0]
(matching bytes from 90 to F0)

(90|FF|55) converts to (\x90|\xFF|\x55)
(matching bytes, 90, FF, or 55)

(?!55) converts to (?!\x55)
(next byte does not equal 55)

All HxD needs to do is recognize bytes and convert them to the regular expression format.

The function pcre_compile() will tell you if the notation is bad, and pcre_exec() reports the start and end offsets of the match, so it would be easy to update the selection in the editor.

This makes the matching engine extremely flexible, and only requires the user to be familiar with regular expression notation, without the ugly syntax (\x00).
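
The conversion described above can be sketched in a few lines. This is a minimal illustration that uses Python's built-in re module as a stand-in for PCRE (an assumption made so the example runs without extra libraries; the two engines are similar but not identical). The names hex_to_regex and search are hypothetical, and the tokenizer only covers the examples listed in this post.

```python
import re

# A quantifier such as {1,200} is kept as-is, a pair of hex digits becomes
# \xHH, and whitespace between bytes is dropped. Everything else
# (dots, brackets, |, lookaheads) passes through unchanged.
_TOKEN = re.compile(r"\{[^}]*\}|[0-9A-Fa-f]{2}|\s+")


def hex_to_regex(pattern):
    """Convert hex-editor style notation into regex escape notation."""
    def convert(match):
        token = match.group(0)
        if token.startswith("{"):
            return token
        if token.isspace():
            return ""
        return "\\x" + token.upper()

    return _TOKEN.sub(convert, pattern)


def search(pattern, data):
    """Return (offset, length) of the first match in data, or None."""
    # DOTALL makes '.' match every byte value, including 0x0A.
    compiled = re.compile(hex_to_regex(pattern).encode("ascii"), re.DOTALL)
    match = compiled.search(data)
    if match is None:
        return None
    return match.start(), match.end() - match.start()


print(hex_to_regex("81 8F E3 F7 9E . . . . 9F FF 8E 87 6F 47 87"))
print(hex_to_regex("(90|FF|55)"))
print(search("81 8F . . 9F", bytes.fromhex("00 81 8F 0A BB 9F")))
```

An invalid pattern makes re.compile raise re.error, which plays the role that the error report from pcre_compile() plays in the description above.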

Maël
Site Admin
Posts: 1129
Joined: 12 Mar 2005 14:15

Post by Maël » 03 Mar 2007 18:09

I don't know if there is really a need for full-blown Perl regular expressions. My first thought was rather to implement "real" regular expressions (i.e. what can be done with a DFA) with a slightly extended syntax that allows specifying byte patterns. AFAICT all your examples would be achievable this way.
b0ne wrote:The function pcre_compile() will tell you if the notation is bad, and pcre_exec() reports the start and end offsets of the match, so it would be easy to update the selection in the editor.
The issue with this is that I allow searching streams of any size (also GBs of data) which means I have to adapt the searching (as not everything can be held in memory). I will first have to investigate how much I would need to adapt the PCRE lib to achieve this. This also means I can't simply use the code, but will have to analyze and understand it.

b0ne wrote:I use regular expressions to search files on machine code binaries on a regular basis
Could you give me some links to those tools, so I can see how they handle binaries?

b0ne
Posts: 5
Joined: 02 Mar 2007 18:40

Post by b0ne » 06 Mar 2007 17:12

Maël wrote:The issue with this is that I allow searching streams of any size (also GBs of data) which means I have to adapt the searching (as not everything can be held in memory).
Perhaps a checkbox to enable "full" regular expressions that will scan the full buffer with the understanding that it cannot be performed on giant streams? It doesn't seem wise to use complicated regular expressions on GBs of data to begin with...

Alternatively, if the scan is going to be extremely large, allow only regular expressions with bounded repetitions, so that you can overlap the buffers that are passed to PCRE?

For instance, you have read in 4 blocks of memory, each 32 bytes in size. You know your pattern matches a maximum of 16 bytes.

The pattern cannot match across blocks 1 to 3 or 2 to 4. Knowing this, you could free block 1 after you've scanned across 1 and 2, then free block 2 after you've scanned across 2 and 3.
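
The overlapping-buffer idea can be sketched like this. It is only an illustration under stated assumptions: Python's re module stands in for PCRE, the caller supplies the maximum match length (nothing here derives it from the pattern), and find_in_stream is a hypothetical name. A start position is only decided once max_match_len bytes after it are in the window, so a match that straddles two blocks is found exactly once.

```python
import io
import re


def find_in_stream(stream, pattern, max_match_len, block_size=32):
    """Yield (offset, length) for matches while holding only a small window."""
    compiled = re.compile(pattern, re.DOTALL)
    window = b""
    base = 0  # absolute stream offset of window[0]
    eof = False
    while not eof:
        block = stream.read(block_size)
        eof = not block
        window += block
        # A start position can be decided once max_match_len bytes follow
        # it in the window, or once the stream has ended.
        limit = len(window) if eof else len(window) - max_match_len + 1
        pos = 0
        while pos < limit:
            match = compiled.search(window, pos)
            if match is None or match.start() >= limit:
                break
            yield base + match.start(), match.end() - match.start()
            pos = max(match.end(), match.start() + 1)
        # Free everything already decided; the undecided tail is kept and
        # overlaps with the next block.
        keep_from = max(pos, limit)
        base += keep_from
        window = window[keep_from:]


# The first match straddles the boundary between the first two 32-byte blocks.
data = b"\x00" * 31 + b"\x34\x4A\x57" + b"\x00" * 60 + b"\x34\x41\x57"
pattern = rb"\x34[\x40-\x4F]\x57"
print(list(find_in_stream(io.BytesIO(data), pattern, max_match_len=3)))
```

The window never grows much beyond one block plus max_match_len bytes, which is what makes the approach usable on streams that do not fit in memory. Patterns with unbounded repetitions such as .* have no such limit and would not work with this scheme.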
Maël wrote:Could you give me some links to those tools, so I can see how they handle binaries?
Well, some are closed source, like PowerGrep, which handles binary searches (not very well). There are tools I use at my job which utilize PCRE, but those aren't hex editors... :) I was privy to the implementation of the internal ones; all of them rely on passing the entire buffer to PCRE.

I haven't investigated ClamAV's source, but they have a byte-pattern matcher in their scanner which supports variable-sized patterns on unknown file sizes.

Maël
Site Admin
Posts: 1129
Joined: 12 Mar 2005 14:15

Post by Maël » 06 Mar 2007 18:07

b0ne wrote:Well, some are closed source, like PowerGrep, which handles binary searches (not very well). There are tools I use at my job which utilize PCRE, but those aren't hex editors... :) I was privy to the implementation of the internal ones; all of them rely on passing the entire buffer to PCRE.
I was more interested in the regular expression syntax. I can't use most open source code anyway, as HxD isn't open source either. In any case, source code isn't the issue; I was more interested in the design of regular expressions for binary files (and their special requirements).

b0ne
Posts: 5
Joined: 02 Mar 2007 18:40

Post by b0ne » 28 Apr 2007 02:49

Maël wrote:I was more interested in the regular expression syntax. I can't use most open source code anyway, as HxD isn't open source either. In any case, source code isn't the issue; I was more interested in the design of regular expressions for binary files (and their special requirements).
Sorry for the extremely delayed response. PCRE is BSD licensed, so all you need to do is throw some copyright info into a text file and call it good. The source is available for study as well.

zespri

Re: HxD: Search with Regex

Post by zespri » 06 Aug 2008 00:21

I personally vote with both hands for a simple pattern search, even if it is not regex. I really miss the ability to search for stuff like "66 ?? 5D ?? ?? 7C". This particular thing would not be that hard to implement. Although I understand that it's less flexible than regex, the time-to-market could be considerably lower. :D

Maël
Site Admin
Posts: 1129
Joined: 12 Mar 2005 14:15

Re: HxD: Search with Regex

Post by Maël » 06 Aug 2008 16:08

I already started implementing something based on PCRE. If I do just a partial implementation, I'll get a lot of complaints about why the pattern search is so limited. But I can tell you that it is very high on the TODO list.

AHUser
Posts: 2
Joined: 28 May 2009 21:19

Hex search with gaps

Post by AHUser » 28 May 2009 21:58

I'd like to search a file where I know the start and the end of a byte sequence. This is mostly the case when I do binary patching of executables or DLLs where fixup values are used for addresses.

A search hex-string like "85 F6 74 0C 83 0D xx xx xx xx 04" could be used to find the position.

Maël
Site Admin
Posts: 1129
Joined: 12 Mar 2005 14:15

Re: Hex search with gaps

Post by Maël » 30 May 2009 17:15

There is a similar request to add regular expressions: http://forum.mh-nexus.de/viewtopic.php?f=4&t=80&start=0

It should be possible to do what you want when regexes are implemented.

AHUser
Posts: 2
Joined: 28 May 2009 21:19

Re: Hex search with gaps

Post by AHUser » 31 May 2009 09:40

I'm looking forward to that. Thanks.

htdug
Posts: 1
Joined: 11 Aug 2018 13:50

Re: HxD: Search with Regex

Post by htdug » 11 Aug 2018 13:58


Maël
Site Admin
Posts: 1129
Joined: 12 Mar 2005 14:15

Re: HxD: Search with Regex

Post by Maël » 06 Feb 2019 11:15

