File with UTF-8 without BOM

Posted: 07 Feb 2010 19:56
by omata
File: SynUnicode.pas

Without a BOM, the file is read as ANSI. With the following change, a UTF-8 file without a BOM is read correctly:

Code:

procedure TUnicodeStrings.LoadFromStream(Stream: TStream);
:
    // default case (Ansi)
    if not Loaded then
    begin
      FSaveFormat := sfAnsi;
      SetLength(SA, Size div SizeOf(AnsiChar));
      if BytesRead > 0 then
      begin
        System.Move(ByteOrderMask[0], SA[1], BytesRead); // max 6 bytes = 6 chars
        if Size > BytesRead then
          Stream.Read(SA[BytesRead + 1], Size - BytesRead); // the first BytesRead chars were copied by System.Move
        // Try UTF-8 first; this relies on UTF8Decode returning '' when SA is not valid UTF-8
        SW := UTF8Decode(SA);
        if SW <> '' then
        begin
          FSaveFormat := sfUTF8;
          SetTextStr(SW);
          Loaded := True;
        end;
      end;
      if not Loaded then
        SetTextStr(SA);
    end;

Re: File with UTF-8 without BOM

Posted: 08 Feb 2010 19:32
by Maël
For saving/loading files as UTF-8, UTF-16 or Ansi, see the LoadFromFile/SaveToFile functions in SynUnicode.pas. Don't use TUnicodeStrings.LoadFromFile/SaveToFile. Adding your code works, but it would be slow for large files, because the whole file is decoded just to find out its encoding. It's faster to first check whether there is a UTF-8 sequence near the beginning of the file and then pass the encoding as a parameter to LoadFromFile.

As the behavior should match the default Delphi behavior though, I won't add it to SynEdit.
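
For illustration, the "check the beginning of the file first" idea could look roughly like the sketch below. LooksLikeUTF8 is a hypothetical helper, not something from SynUnicode.pas, and the sample byte arrays are made up. It only scans a small buffer, so the cost does not depend on the file size. It is a simplified check: it validates lead and continuation bytes but does not reject every overlong or surrogate encoding, and a sequence cut off at the end of the sample is simply ignored.

```pascal
program Utf8Sniff;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper (assumption, not part of SynUnicode.pas).
// Returns True when the first Count bytes of Buf contain at least one
// complete, well-formed multi-byte UTF-8 sequence and no malformed one.
// Pure 7-bit ASCII returns False, so such data would still be treated as Ansi.
function LooksLikeUTF8(const Buf: array of Byte; Count: Integer): Boolean;
var
  I, Need, Trail: Integer;
begin
  Result := False;
  if Count > Length(Buf) then
    Count := Length(Buf);
  I := 0;
  while I < Count do
  begin
    // Number of continuation bytes announced by the lead byte
    if Buf[I] < $80 then
      Need := 0
    else if (Buf[I] >= $C2) and (Buf[I] <= $DF) then
      Need := 1
    else if (Buf[I] >= $E0) and (Buf[I] <= $EF) then
      Need := 2
    else if (Buf[I] >= $F0) and (Buf[I] <= $F4) then
      Need := 3
    else
    begin
      // Stray continuation byte or a byte that never starts a sequence
      Result := False;
      Exit;
    end;
    Inc(I);
    Trail := Need;
    // Consume continuation bytes (10xxxxxx) while the sample lasts
    while (Trail > 0) and (I < Count) do
    begin
      if (Buf[I] and $C0) <> $80 then
      begin
        Result := False;
        Exit;
      end;
      Inc(I);
      Dec(Trail);
    end;
    // Only a complete multi-byte sequence counts as evidence for UTF-8
    if (Need > 0) and (Trail = 0) then
      Result := True;
  end;
end;

const
  // 'caf' + e-acute encoded as UTF-8 (C3 A9)
  Utf8Sample: array[0..4] of Byte = ($63, $61, $66, $C3, $A9);
  // 'caf' + e-acute encoded as Latin-1 (E9), followed by '!'
  AnsiSample: array[0..4] of Byte = ($63, $61, $66, $E9, $21);
  // plain ASCII 'cafe'
  AsciiSample: array[0..3] of Byte = ($63, $61, $66, $65);

begin
  WriteLn(LooksLikeUTF8(Utf8Sample, Length(Utf8Sample)));
  WriteLn(LooksLikeUTF8(AnsiSample, Length(AnsiSample)));
  WriteLn(LooksLikeUTF8(AsciiSample, Length(AsciiSample)));
end.
```

With these samples the program should print TRUE for the UTF-8 bytes and FALSE for the Latin-1 and ASCII ones. In an application you would fill the buffer with the first few hundred bytes of the file and, depending on the result, pass the UTF-8 or Ansi encoding to LoadFromFile.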

Re: File with UTF-8 without BOM

Posted: 09 Feb 2010 20:31
by omata
I'm sorry.

Please delete my account.

Re: File with UTF-8 without BOM

Posted: 09 Feb 2010 20:32
by omata
...