HxD 2.5.0.0: Search for "00" or binary 00 00

Bug reports concerning HxD.
asdvw43
Posts: 10
Joined: 29 Jun 2022 19:54

HxD 2.5.0.0: Search for "00" or binary 00 00

Post by asdvw43 »

Searching for "two of the same" (text "00", "99", and also binary 00 00, 39 39, etc.) returns a much lower count than the true one.
Tested with 1 MB files containing the digits "0" to "9" as text, and also as binary bytes 00 to 09.
Maël
Site Admin
Posts: 1455
Joined: 12 Mar 2005 14:15

Re: HxD 2.5.0.0: Search for "00" or binary 00 00

Post by Maël »

Please provide the test file and the exact search pattern you used, so that I can easily reproduce the error.
Additionally, what was the expected count, and what count did you see?
asdvw43
Posts: 10
Joined: 29 Jun 2022 19:54

Re: HxD 2.5.0.0: Search for "00" or binary 00 00

Post by asdvw43 »

The two files in the attached zip each contain 1 MB (1048576) decimals of Pi, one in text format, the other binary.

I got the expected ~10% for each single digit, but only roughly 0.908% for any "two of the same".

I wrote a function myself, and it shows "two of the same" at the expected ~1%.

Search the files for "00" or 00 00 (text/binary): 9518 hits. All "two of the same" patterns show similar numbers.
Search for mixed combinations ("01", "02", "03", etc.): 10485 hits would be exactly 1%, and that's roughly what is shown (I compared my app's results with HxD's: they are the same).

EDIT //

I suspect your function jumps too far to the next comparison; it should only jump ONE byte. I think it works like this: "00" is found, then it jumps TWO bytes (or the whole search length). But "000" can also exist: "00x" is found, but "x00" won't be found that way.
Attachments
h100000 pi decimals.zip (965.07 KiB)
asdvw43
Posts: 10
Joined: 29 Jun 2022 19:54

Re: HxD 2.5.0.0: Search for "00" or binary 00 00

Post by asdvw43 »

These are the results I get with my function. The only results that differ are those for two of the same (00, 11, 22, and so on up to 99), of which HxD finds about 91%.

00 10436
01 10363
02 10532
03 10542
04 10435
05 10490
06 10361
07 10458
08 10615
09 10557
10 10522
11 10498
12 10205
13 10493
14 10348
15 10474
16 10440
17 10418
18 10543
19 10647
20 10369
21 10631
22 10596
23 10386
24 10530
25 10453
26 10343
27 10756
28 10471
29 10404
30 10556
31 10435
32 10529
33 10510
34 10450
35 10670
36 10440
37 10577
38 10445
39 10508
40 10481
41 10460
42 10466
43 10484
44 10431
45 10499
46 10687
47 10519
48 10397
49 10528
50 10530
51 10304
52 10571
53 10519
54 10665
55 10771
56 10458
57 10554
58 10407
59 10494
60 10605
61 10382
62 10559
63 10467
64 10468
65 10506
66 10305
67 10349
68 10351
69 10360
70 10383
71 10574
72 10401
73 10594
74 10461
75 10485
76 10574
77 10323
78 10514
79 10382
80 10513
81 10437
82 10564
83 10577
84 10444
85 10417
86 10421
87 10380
88 10604
89 10527
90 10394
91 10503
92 10516
93 10548
94 10721
95 10508
96 10323
97 10357
98 10537
99 10580
asdvw43
Posts: 10
Joined: 29 Jun 2022 19:54

Re: HxD 2.5.0.0: Search for "00" or binary 00 00

Post by asdvw43 »

Binary search for 00 00
Bytes 00, 00, 00 exist but not all are found
HxD 00 00 binary search.jpg (191.43 KiB)
Maël
Site Admin
Posts: 1455
Joined: 12 Mar 2005 14:15

Re: HxD 2.5.0.0: Search for "00" or binary 00 00

Post by Maël »

Thanks for the test data.

I opened the file "h100000 pi decimals.txt" in Notepad++, then did a text search for "00" (there is an option to count the occurrences instead of listing them). The count is 9518.

However, if I understand your post correctly, 00 should appear 10436 times.

HxD also has a result of 9518 when doing a text search for "00" in "h100000 pi decimals.txt".

Maybe you expect search to work differently:
Searching "000" for "00" should give an occurrence count of 2, not 1, right? But that's not how search is supposed to work. It always continues searching after the end of the last found item.
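
To make the difference between the two counting modes concrete, here is a minimal Python sketch. It is not HxD's actual code: the function name `count_matches` and the `overlap` flag are made up for illustration, and it works on raw bytes only (for text, the step would be one character in the given encoding rather than one byte).

```python
def count_matches(data: bytes, pattern: bytes, overlap: bool = False) -> int:
    """Count occurrences of pattern in data.

    overlap=False resumes after the END of each match (the usual behaviour);
    overlap=True resumes one byte after the START of each match.
    Hypothetical helper for illustration, not taken from HxD.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    step_after_hit = 1 if overlap else len(pattern)
    count = 0
    pos = data.find(pattern)
    while pos != -1:
        count += 1
        pos = data.find(pattern, pos + step_after_hit)
    return count


print(count_matches(b"000", b"00"))                # 1 (no overlap)
print(count_matches(b"000", b"00", overlap=True))  # 2 (overlapping)
```

With five zero bytes, `count_matches(bytes(5), b"\x00\x00")` gives 2, while `overlap=True` gives 4.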
Maël
Site Admin
Posts: 1455
Joined: 12 Mar 2005 14:15

Re: HxD 2.5.0.0: Search for "00" or binary 00 00

Post by Maël »

asdvw43 wrote: 30 Jun 2022 08:00 I suspect your function jumps too far to the next comparison; it should only jump ONE byte. I think it works like this: "00" is found, then it jumps TWO bytes (or the whole search length). But "000" can also exist: "00x" is found, but "x00" won't be found that way.
It does work like that: after the pattern is found, it continues searching after the end of the found occurrence. It does not continue searching after the first byte of the occurrence.
But that's not too far; it's how it is meant to work. See Notepad++ and the search in most programs.
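
This also seems to account for the roughly 91% ratio reported earlier in the thread. The sketch below is an idealized model, not a measurement: it assumes the digits behave like independent, uniformly random values (so each digit has probability p = 0.1) and ignores file boundaries. The function name `expected_hit_rates` is made up for illustration.

```python
def expected_hit_rates(p: float, max_run: int = 200) -> tuple[float, float]:
    """Expected hits per position for a pattern "dd" in random data.

    p is the probability of the value d at any position (0.1 for digits).
    A maximal run of exactly k copies of d starts at a given position with
    probability (1 - p)**2 * p**k. Such a run holds k - 1 overlapping hits,
    but only k // 2 non-overlapping ones.
    Runs longer than max_run are ignored (their contribution is negligible).
    """
    run_prob = lambda k: (1 - p) ** 2 * p ** k
    overlapping = sum(run_prob(k) * (k - 1) for k in range(2, max_run))
    non_overlapping = sum(run_prob(k) * (k // 2) for k in range(2, max_run))
    return overlapping, non_overlapping


overlapping, non_overlapping = expected_hit_rates(0.1)
print(f"overlapping:     {overlapping:.3%}")      # about 1%
print(f"non-overlapping: {non_overlapping:.3%}")  # about 0.909%
print(f"ratio:           {non_overlapping / overlapping:.1%}")  # about 90.9%
```

In closed form the two rates are p² and p²/(1+p), so the ratio is 1/(1+p) = 1/1.1. For a file of 1048576 digits, that predicts about 10486 overlapping and about 9533 non-overlapping hits, close to the 10436 and 9518 reported for "00".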
Maël
Site Admin
Posts: 1455
Joined: 12 Mar 2005 14:15

Re: HxD 2.5.0.0: Search for "00" or binary 00 00

Post by Maël »

So it's not a bug, but you could make it a feature request. I wouldn't know what this kind of search would be called though.

You could say byte search, but that's too vague and not accurate, because for text the basic unit is not necessarily a byte, depending on the encoding.

Edit: a few experiments with possible wordings/descriptions.
The search GUI could be extended with an additional group box, something like one of the following, but shorter:
Search for next match after:
  • End of current match (no overlap, default)
  • First byte of current match (overlapping)
Overlapping of matches:
  • No overlap (default)
  • Overlap (match bytes can overlap, except first)
Overlapping of matches:
  • No overlap (default)
  • Overlap (matches offset by one byte min.)
Overlapping of matches:
  • No overlap (default)
  • Overlap (min. one byte offset)
Overlapping of matches:
  • Never overlap (default)
  • Can overlap (min. one byte offset)
Search match extents:
  • Never overlap (default)
  • May overlap (min. one byte offset)
Extents of finds:
  • Never overlap (default)
  • May overlap (min. one byte offset)
Extents of finds may:
  • Never overlap (default)
  • Overlap (min. one byte offset)
Finds' extents may:
  • Never overlap (default)
  • Overlap (min. one byte offset)
Matches can overlap:
  • Never (default)
  • Possible (min. one byte offset)
Overlapping of matches:
  • Never (default)
  • Allowed (min. one byte offset)
My current favorite is the 9th or 8th option.

So you could call it overlapping search, but I am not sure if that's an established term.
asdvw43
Posts: 10
Joined: 29 Jun 2022 19:54

Re: HxD 2.5.0.0: Search for "00" or binary 00 00

Post by asdvw43 »

This will make it clear I hope.
00 00 00 00 00 = 4 hits.jpg (117.12 KiB)
Btw, I meant hexadecimal, not binary as in 00101001 :)
Maël
Site Admin
Posts: 1455
Joined: 12 Mar 2005 14:15

Re: HxD 2.5.0.0: Search for "00" or binary 00 00

Post by Maël »

Yes, it's clear :) Please see my other posts.
asdvw43
Posts: 10
Joined: 29 Jun 2022 19:54

Re: HxD 2.5.0.0: Search for "00" or binary 00 00

Post by asdvw43 »

There could be another checkbox in Search:

By Occurrence (unchecked by default, meaning "By Length")