File with UTF-8 without BOM

omata
Posts: 5
Joined: 07 Feb 2010 19:50

File with UTF-8 without BOM

Post by omata »

File: SynUnicode.pas

Without a BOM, a UTF-8 file is read as ANSI. With the following change, this case is read correctly:


procedure TUnicodeStrings.LoadFromStream(Stream: TStream);
:
    // default case (Ansi)
    if not Loaded then
    begin
      FSaveFormat := sfAnsi;
      SetLength(SA, Size div SizeOf(AnsiChar));
      if BytesRead > 0 then
      begin
        System.Move(ByteOrderMask[0], SA[1], BytesRead); // max 6 bytes = 6 chars
        if Size > BytesRead then
          Stream.Read(SA[7], Size - BytesRead); // first 6 chars were copied by System.Move
        // Try UTF-8 first; this check relies on UTF8Decode returning ''
        // when SA is not valid UTF-8
        SW := UTF8Decode(SA);
        if SW <> '' then
        begin
          FSaveFormat := sfUTF8;
          SetTextStr(SW);
          Loaded := True;
        end;
      end;
      if not Loaded then
        SetTextStr(SA);
    end;
Maël
Site Admin
Posts: 1454
Joined: 12 Mar 2005 14:15

Re: File with UTF-8 without BOM

Post by Maël »

For saving/loading files as UTF-8, UTF-16 or Ansi, see the LoadFromFile/SaveToFile functions in SynUnicode.pas. Don't use TUnicodeStrings.LoadFromFile/SaveToFile. Adding your code works, but it would be slow for large files. It's faster to first detect whether there might be a UTF-8 sequence at the beginning of the file and then specify the encoding as a parameter in LoadFromFile.

As the behavior should match the default Delphi behavior though, I won't add it to SynEdit.
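
The detection step described above could be sketched roughly as follows. This is only an illustration, not code from SynUnicode.pas: LooksLikeUTF8, FileLooksLikeUTF8 and the 4096-byte sample size are hypothetical names and values chosen for the example. The check is deliberately simple (it does not reject overlong encodings or surrogates), and passing the result on to LoadFromFile is left to the caller, since the exact signature is not shown in this thread.

```pascal
program DetectUtf8Sketch;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

// Hypothetical helper: True when the first Count bytes of Buf contain at
// least one well-formed multi-byte UTF-8 sequence and no malformed one.
// Pure ASCII input returns False, so it would still be treated as ANSI.
function LooksLikeUTF8(const Buf: array of Byte; Count: Integer): Boolean;
var
  I, Trail: Integer;
begin
  Result := False;
  if Count > Length(Buf) then
    Count := Length(Buf);
  I := 0;
  while I < Count do
  begin
    // Determine how many continuation bytes the lead byte announces
    if Buf[I] < $80 then
      Trail := 0
    else if (Buf[I] and $E0) = $C0 then
      Trail := 1
    else if (Buf[I] and $F0) = $E0 then
      Trail := 2
    else if (Buf[I] and $F8) = $F0 then
      Trail := 3
    else
    begin
      Result := False; // stray continuation byte or invalid lead byte
      Exit;
    end;
    Inc(I);
    if Trail > 0 then
    begin
      if I + Trail > Count then
        Break; // sequence is cut off by the end of the sample
      while Trail > 0 do
      begin
        if (Buf[I] and $C0) <> $80 then
        begin
          Result := False; // expected a continuation byte (10xxxxxx)
          Exit;
        end;
        Inc(I);
        Dec(Trail);
      end;
      Result := True; // saw one complete multi-byte sequence
    end;
  end;
end;

// Hypothetical helper: inspects only the beginning of the file, so the
// cost does not grow with the file size.
function FileLooksLikeUTF8(const FileName: string): Boolean;
const
  SampleSize = 4096; // assumed sample size
var
  Stream: TFileStream;
  Buf: array[0..SampleSize - 1] of Byte;
  N: Integer;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    N := Stream.Read(Buf, SizeOf(Buf));
  finally
    Stream.Free;
  end;
  Result := LooksLikeUTF8(Buf, N);
end;

begin
  if ParamCount > 0 then
  begin
    if FileLooksLikeUTF8(ParamStr(1)) then
      Writeln('Sample contains UTF-8 sequences: load as UTF-8')
    else
      Writeln('No UTF-8 sequence in sample: load as ANSI');
  end;
end.
```

Because only a sample is inspected, a file whose first non-ASCII character appears after the sample would still be treated as ANSI. That is the trade-off for not decoding the whole file twice.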
omata
Posts: 5
Joined: 07 Feb 2010 19:50

Re: File with UTF-8 without BOM

Post by omata »

I'm sorry.

Please delete my account.
omata
Posts: 5
Joined: 07 Feb 2010 19:50

Re: File with UTF-8 without BOM

Post by omata »

...