
[FEATURE REQUEST] Descriptive error message with exception when having invalid data in BULK INSERT #2610

Open
atul-delphix opened this issue Feb 19, 2025 · 1 comment
atul-delphix commented Feb 19, 2025

Is your feature request related to a problem? If so, please give a short summary of the problem and how the feature would resolve it

We are using the mssql-jdbc driver version 12.4.2.jre11 with mssql-spark-connector version 3.1.

While loading data from a CSV file into the target database, any invalid data (e.g., a value in the CSV file longer than the corresponding column in the target table) causes the load to fail with the generic error message "Bulk load data was expected but not sent. The batch will be terminated."
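
For context, our write path is roughly the sketch below (file paths, names, and credentials are illustrative rather than our actual job; the format string and options follow the mssql-spark-connector's documented data source options):

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SaveMode;
    import org.apache.spark.sql.SparkSession;

    public class CsvToSqlServer {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder().appName("csv-load").getOrCreate();

            // Read the CSV file; an oversized value does not fail here, only
            // later, inside the bulk copy running on an executor.
            Dataset<Row> df = spark.read().option("header", "true").csv("/data/input.csv");

            df.write()
                    .format("com.microsoft.sqlserver.jdbc.spark")
                    .mode(SaveMode.Append)
                    .option("url", "jdbc:sqlserver://host;databaseName=targetdb;encrypt=true;trustServerCertificate=true")
                    .option("dbtable", "dbo.test_table_target")
                    .option("user", "user")
                    .option("password", "password")
                    .save();
        }
    }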

From this message alone it is almost impossible to determine the root cause of the failure. Please refer to the sample stack trace from such a failure:

08-08-2024 05:46:40.107 [task-result-getter-1] WARN  o.a.spark.scheduler.TaskSetManager.logWarning - Lost task 0.0 in stage 189.0 (TID 189) (af5307e43913 executor driver): com.microsoft.sqlserver.jdbc.SQLServerException: Bulk load data was expected but not sent. The batch will be terminated.
	at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:261)
	at com.microsoft.sqlserver.jdbc.TDSTokenHandler.onEOF(tdsparser.java:316)
	at com.microsoft.sqlserver.jdbc.TDSParser.parse(tdsparser.java:137)
	at com.microsoft.sqlserver.jdbc.TDSParser.parse(tdsparser.java:42)
	at com.microsoft.sqlserver.jdbc.TDSParser.parse(tdsparser.java:31)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection$1ConnectionCommand.doExecute(SQLServerConnection.java:4533)
	at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7748)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:4410)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectionCommand(SQLServerConnection.java:4541)
	at com.microsoft.sqlserver.jdbc.SQLServerConnection.rollback(SQLServerConnection.java:4747)
	at com.microsoft.sqlserver.jdbc.spark.BulkCopyUtils$.savePartition(BulkCopyUtils.scala:68)
	at com.microsoft.sqlserver.jdbc.spark.SingleInstanceWriteStrategies$.$anonfun$write$2(BestEffortSingleInstanceStrategy.scala:43)
	at com.microsoft.sqlserver.jdbc.spark.SingleInstanceWriteStrategies$.$anonfun$write$2$adapted(BestEffortSingleInstanceStrategy.scala:42)
	at org.apache.spark.rdd.RDD.$anonfun$foreachPartition$2(RDD.scala:1009)
	at org.apache.spark.rdd.RDD.$anonfun$foreachPartition$2$adapted(RDD.scala:1009)
	at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2303)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
	at org.apache.spark.scheduler.Task.run(Task.scala:139)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)

Describe the preferred solution

The error should clearly identify which table and column are affected, together with the offending value. For example:

 String or binary data would be truncated in table 'SourceDB2019.dbo.test_table_target', column 'char_col_with_2_length'. Truncated value: 'lo'
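
For comparison, SQL Server can already report this level of detail for a plain INSERT. A minimal sketch (table and column names taken from the example message above; this assumes a server version/compatibility level where the descriptive truncation error, msg 2628, is active):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class TruncationMessageDemo {
        public static void main(String[] args) {
            String url = "jdbc:sqlserver://host;databaseName=SourceDB2019;encrypt=true;trustServerCertificate=true;user=user;password=password";
            try (Connection con = DriverManager.getConnection(url);
                    Statement stmt = con.createStatement()) {
                stmt.executeUpdate("CREATE TABLE dbo.test_table_target (char_col_with_2_length CHAR(2))");
                // 'long' does not fit in CHAR(2); with the descriptive truncation
                // error enabled, the exception names the table, the column, and
                // the truncated value ('lo').
                stmt.executeUpdate("INSERT INTO dbo.test_table_target VALUES ('long')");
            } catch (SQLException e) {
                System.err.println(e.getMessage());
            }
        }
    }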

Describe alternatives you've considered

Additional context

Reference Documentations/Specifications

Reference Implementation

github-project-automation bot moved this to To be triaged in MSSQL JDBC Feb 19, 2025
machavan self-assigned this Feb 20, 2025
machavan added the Under Investigation label Feb 20, 2025

machavan (Contributor) commented:
Hi @atul-delphix

When I tried to reproduce the issue with the code snippet below

        String url = "jdbc:sqlserver://instance;database=db;encrypt=true;trustServerCertificate=true;username=user;password=password";

        Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver");

        String csvFilePath = "C:\\Users\\Bob\\Downloads\\test.csv"; 
        String destinationTable = "dbo.VARCHAR_TABLE"; 

        int countBefore, countAfter;

        // Get data from the source file by loading it into a class that implements ISQLServerBulkRecord.
        // Here we are using the SQLServerBulkCSVFileRecord implementation to import the example CSV file.
        try (Connection destinationConnection = DriverManager.getConnection(url);
                Statement stmt = destinationConnection.createStatement();
                SQLServerBulkCopy bulkCopy = new SQLServerBulkCopy(destinationConnection);
                SQLServerBulkCSVFileRecord fileRecord = new SQLServerBulkCSVFileRecord(csvFilePath, true);) {
            
            fileRecord.setEscapeColumnDelimitersCSV(true);
            // Set the metadata for each column to be copied.
            fileRecord.addColumnMetadata(1, null, java.sql.Types.INTEGER, 0, 0);
            fileRecord.addColumnMetadata(2, null, java.sql.Types.VARCHAR, 10, 0);
            fileRecord.addColumnMetadata(3, null, java.sql.Types.VARCHAR, 10, 0);

            // Empty the destination table.
            stmt.executeUpdate("DELETE FROM " + destinationTable);

            // Perform an initial count on the destination table.
            countBefore = getRowCount(stmt, destinationTable);

            // Set up the bulk copy object.
            // Note that the column positions in the source
            // data reader match the column positions in
            // the destination table so there is no need to
            // map columns.
            bulkCopy.setDestinationTableName(destinationTable);

            // Write from the source to the destination.
            bulkCopy.writeToServer(fileRecord);

            // Perform a final count on the destination
            // table to see how many rows were added.
            countAfter = getRowCount(stmt, destinationTable);
            System.out.println((countAfter - countBefore) + " rows were added.");
        }
        // Handle any errors that may have occurred.
        catch (SQLException e) {
            e.printStackTrace();
        }    
   }

with CSV contents like this:

1,Hi,Hi
2,Hello,Hello
3,There,Thereeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee

it throws an appropriate error message:

com.microsoft.sqlserver.jdbc.SQLServerException: The given value of type VARCHAR(132) from the data source cannot be converted to type varchar(10) of the specified target column COL2.
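
For reference, the destination table assumed here can be created as below (the schema is inferred from the addColumnMetadata calls and the error message above, since the DDL was not part of the repro):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class CreateVarcharTable {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:sqlserver://instance;database=db;encrypt=true;trustServerCertificate=true;username=user;password=password";
            try (Connection con = DriverManager.getConnection(url);
                    Statement stmt = con.createStatement()) {
                // Schema inferred from the repro: an INT column plus two VARCHAR(10) columns.
                stmt.executeUpdate("CREATE TABLE dbo.VARCHAR_TABLE (COL1 INT, COL2 VARCHAR(10), COL3 VARCHAR(10))");
            }
        }
    }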

Would you like to suggest changes to this repro so that it produces the error you encountered?

machavan added the Waiting for Response label Feb 20, 2025