


When using COPY through Npgsql in binary mode, it is your responsibility to read and write the correct type! If you use COPY to write an int32 into a string field you may get an exception, or worse, silent data corruption. It is also highly recommended to use the overload of Write() which accepts an NpgsqlDbType, allowing you to unambiguously specify exactly what type you want to write. The snippets below assume an open NpgsqlConnection named conn:

using (var writer = conn.BeginBinaryImport("COPY data (field_text, field_int2) FROM STDIN (FORMAT BINARY)"))
{
    writer.StartRow();
    writer.Write("Hello");
    writer.Write(8, NpgsqlDbType.Smallint);
    writer.Complete();   // required on Npgsql 4.0 and later; older versions committed on Dispose
}

using (var reader = conn.BeginBinaryExport("COPY data (field_text, field_int2) TO STDOUT (FORMAT BINARY)"))
{
    reader.StartRow();
    Console.WriteLine(reader.Read<string>());
    Console.WriteLine(reader.IsNull);   // Null check doesn't consume the column
    Console.WriteLine(reader.Read<short>(NpgsqlDbType.Smallint));

    reader.StartRow();   // Last StartRow() returns -1 to indicate end of data
}

Text mode, by contrast, uses the PostgreSQL text or csv format to transfer data in and out of the database. It is the user's responsibility to format the text or CSV appropriately; Npgsql simply provides a TextReader or Writer. This mode is less efficient than binary copy, and is suitable mainly if you already have the data in a CSV or compatible text format and don't care about performance.
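
A minimal text-mode round trip can then look like the following sketch (same assumptions as above; BeginTextImport hands back a TextWriter and BeginTextExport a TextReader, and the row formatting is entirely up to you):

using (var writer = conn.BeginTextImport("COPY data (field_text, field_int2) FROM STDIN"))
{
    writer.Write("Hello\t8\n");       // text format: tab-delimited, newline-terminated rows
    writer.Write("Goodbye\t\\N\n");   // \N is the default NULL marker
}

using (var reader = conn.BeginTextExport("COPY data (field_text, field_int2) TO STDOUT"))
{
    string line;
    while ((line = reader.ReadLine()) != null)
        Console.WriteLine(line);      // one raw row per line, exactly as PostgreSQL sends it
}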

Here is the syntax for COPY, as returned by the 8.3 client:

Description: copy data between a file and a table
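
The syntax block below is reconstructed from the 8.3 documentation, so treat it as an approximation and run \h COPY in your own psql for the authoritative version:

COPY tablename [ ( column [, ...] ) ]
    FROM { 'filename' | STDIN }
    [ [ WITH ]
          [ BINARY ]
          [ OIDS ]
          [ DELIMITER [ AS ] 'delimiter' ]
          [ NULL [ AS ] 'null string' ]
          [ CSV [ HEADER ]
                [ QUOTE [ AS ] 'quote' ]
                [ ESCAPE [ AS ] 'escape' ]
                [ FORCE NOT NULL column [, ...] ] ] ]

COPY { tablename [ ( column [, ...] ) ] | ( query ) }
    TO { 'filename' | STDOUT }
    [ [ WITH ]
          [ BINARY ]
          [ OIDS ]
          [ DELIMITER [ AS ] 'delimiter' ]
          [ NULL [ AS ] 'null string' ]
          [ CSV [ HEADER ]
                [ QUOTE [ AS ] 'quote' ]
                [ ESCAPE [ AS ] 'escape' ]
                [ FORCE QUOTE column [, ...] ] ] ]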

The syntax for \COPY is slightly different: (a) being a psql command, it is not terminated by a semicolon, and (b) file paths are relative to the current working directory.

COPY is not terribly smart or clever; in fact, it is dumb and simple. Many of these options can come after the CSV keyword: WITH CSV NULL AS '...', for example, is perfectly permissible.

Here are some problems you can expect with COPY:

Copying from a source with excess columns
Let's say you're copying from a source that has the fields (Foo, Bar, Foo+Bar), and while copying you realize that Foo+Bar is stupid, since it is just a function of Foo and Bar. PostgreSQL provides no way to ignore this column in the load: a CSV load requires every column in the feed to be present in the table you're loading it into. Work to fix this has been going on under the term "ragged csvs".
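
Until then, the practical workaround is to strip the unwanted column before loading, or to load into a staging table that still has it. A sketch of the first approach, with illustrative file and table names, assuming a comma-delimited feed whose third column is the redundant Foo+Bar (note that cut is naive about commas inside quoted fields, which is exactly the embedded-delimiter problem described next):

cut -d',' -f1,2 source.csv > trimmed.csv
psql -d mydb -c "\copy target_table (foo, bar) from trimmed.csv with csv"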

Embedded delimiters
This is all too common, especially with user-created memo text. COPY expects tabs as delimiters by default, but you can specify something else with "USING DELIMITERS 'whatever'". If you have extra delimiter characters, COPY will find too many fields to fit your table, and the load will fail. Most often this will show as a data type mismatch, as COPY attempts to stuff your text string into a date field or something similar.

Backslash characters
A literal backslash in the data means trouble, because the following delimiter is thereby escaped, and no longer recognized as a delimiter. All the fields get shifted down by one, with the result that (a) your COPY fails because of a data type mismatch, or (b) your data is silently accepted in a mangled state.

Carriage return characters (CR)
If a CR is not removed, it will end up in the final field of your table. If that field is a number or date data type, your COPY will fail. If the field is a character data type, your COPY will succeed, and you will then be scratching your head trying to work out why comparisons using that field give such strange results. If you received the datafile from an FTP server, you can avoid these altogether by using the "ascii" transfer method. Otherwise, simply delete all of the carriage returns using a simple script: tr -d '\r' will automatically adjust the line endings for you. If the data itself contains CRs you want to keep, use a program like GNU recode, which will alter only the CRs used in the line delimiter; the command for that is recode /cl datafile.

NULL confusion
COPY expects NULLs to be represented as "\N" (backslash-N) by default. You can change this by using WITH NULL AS 'something_else'. If you have empty fields in your data, I strongly recommend using WITH NULL AS ''; otherwise, COPY will assume empty fields represent empty strings, and will bomb on the first empty number or date field. (This is subject to the caveat about how the CSV mode implements nulls.) Note that you cannot mix NULL representations!

Wrong data for datatype
Your data must match the format required by the relevant PostgreSQL data type.
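
Putting two of these fixes together (stripping carriage returns, and treating empty fields as NULL), a typical cleanup-and-load might look like this; the names are illustrative, and note that the \copy command takes no trailing semicolon and resolves clean.dat relative to the current working directory:

tr -d '\r' < raw.dat > clean.dat
psql -d mydb -c "\copy target_table from clean.dat with null as ''"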