SAP: how to read a CSV file correctly in ABAP

CSV files are still an important file format. But how can you read such a file in ABAP?

The backbone for processing a CSV file in ABAP is the function module RSDS_CONVERT_CSV. It splits a single line of a CSV file into its individual elements, handling quotation marks correctly (escape handling).

First, I’ll show you the complete code of the program. Then, we’ll explore why we need the function module and why it’s not enough to split the lines into fields with a split command.

Additionally, I’ll also explain how the framework of the program works, including the access to the file system.

This is the code for reading a CSV file in ABAP

Here is the complete code:

*&---------------------------------------------------------------------*
*& Report ZREADCSV
*&---------------------------------------------------------------------*
*&
*&---------------------------------------------------------------------*
REPORT zreadcsv.

PARAMETERS: p_file TYPE string.

TYPES: tt_string TYPE TABLE OF string.


AT SELECTION-SCREEN ON VALUE-REQUEST FOR p_file.

  PERFORM at_selection_screen_val_file CHANGING p_file.


START-OF-SELECTION.

  " check, whether file file exists:
  DATA: lf_ok TYPE abap_bool.

  PERFORM check_file_exists USING p_file
                         CHANGING lf_ok.

  IF lf_ok <> abap_true.
    WRITE: / 'file does not exist:', p_file.
    RETURN.
  ENDIF.

  PERFORM read_file USING p_file.


  "--------------------------
FORM at_selection_screen_val_file
     CHANGING cf_filename TYPE string.

  DATA:
    lt_filetable TYPE filetable,
    ls_filetable TYPE file_table,
    lf_action    TYPE i,
    lf_return    TYPE i.

  CALL METHOD cl_gui_frontend_services=>file_open_dialog
    EXPORTING
*     window_title            =
      default_extension       = 'CSV'
      file_filter             = '*.CSV'
*     with_encoding           =
    CHANGING
      file_table              = lt_filetable
      rc                      = lf_return
      user_action             = lf_action
*     file_encoding           =
    EXCEPTIONS
      file_open_dialog_failed = 1
      cntl_error              = 2
      error_no_gui            = 3
      not_supported_by_gui    = 4
      OTHERS                  = 5.
  IF sy-subrc <> 0.
    " error handling
    MESSAGE ID   sy-msgid TYPE sy-msgty NUMBER sy-msgno
            WITH sy-msgv1 sy-msgv2 sy-msgv3 sy-msgv4.
    RETURN.
  ENDIF.
  " OK => use filename
  IF    lf_action = cl_gui_frontend_services=>action_ok
    AND lf_return = 1.
*     Determine file name
    READ TABLE lt_filetable INTO ls_filetable INDEX 1.
    IF sy-subrc IS INITIAL.
*       take over file name
      cf_filename = ls_filetable-filename.
    ENDIF.
  ENDIF.

ENDFORM.                    " AT_SELECTION_SCREEN_VAL_FILEI

FORM check_file_exists USING uf_filename TYPE string
                       CHANGING cf_ok TYPE abap_bool.

  DATA:
    lf_result TYPE c.

  " Check if file exists
  CALL METHOD cl_gui_frontend_services=>file_exist
    EXPORTING
      file                 = uf_filename
    RECEIVING
      result               = lf_result
    EXCEPTIONS
      cntl_error           = 1
      error_no_gui         = 2
      wrong_parameter      = 3
      not_supported_by_gui = 4
      OTHERS               = 5.

  IF sy-subrc NE 0 OR lf_result NE abap_true.
    cf_ok = abap_false.
  ELSE.
    cf_ok = abap_true.
  ENDIF.

ENDFORM.                    "at_selection_screen_pa_filei

FORM read_file USING if_filename TYPE string.
  DATA: lt_lines   TYPE tt_string,
        lf_line    TYPE string,
        lf_linenum TYPE sy-index,
        lf_colnum  TYPE sy-index,
        lt_columns TYPE tt_string,
        lf_column  TYPE string.


  CALL METHOD cl_gui_frontend_services=>gui_upload
    EXPORTING
      filename = if_filename
    CHANGING
      data_tab = lt_lines.

  lf_linenum = 0.
  LOOP AT lt_lines INTO lf_line.
    lf_linenum = sy-tabix.
    CALL FUNCTION 'RSDS_CONVERT_CSV'
      EXPORTING
        i_data_sep       = ','
        i_esc_char       = '"'
        i_record         = lf_line
        i_field_count    = 9999
      IMPORTING
        e_t_data         = lt_columns
      EXCEPTIONS
        escape_no_close  = 1
        escape_improper  = 2
        conversion_error = 3
        OTHERS           = 4.

    LOOP AT lt_columns INTO lf_column.
      lf_colnum = sy-tabix.
      WRITE: / 'row:', lf_linenum, 'col:', lf_colnum, 'content:', lf_column.
    ENDLOOP.

  ENDLOOP.
ENDFORM.

Here is a sample input file:

Vorname,Nachname,Strasse,PLZ,Ort
Donald,Duck,"Gänseweg 7, Hinterhaus",25337,Entenhausen
Dagobert,Duck,Am Geldspeicher 1,25336,Entenhausen

This is the result:

read csv file

row:          1  col:          1  content: Vorname
row:          1  col:          2  content: Nachname
row:          1  col:          3  content: Strasse
row:          1  col:          4  content: PLZ
row:          1  col:          5  content: Ort
row:          2  col:          1  content: Donald
row:          2  col:          2  content: Duck
row:          2  col:          3  content: Gänseweg 7, Hinterhaus
row:          2  col:          4  content: 25337
row:          2  col:          5  content: Entenhausen
row:          3  col:          1  content: Dagobert
row:          3  col:          2  content: Duck
row:          3  col:          3  content: Am Geldspeicher 1
row:          3  col:          4  content: 25336
row:          3  col:          5  content: Entenhausen

The structure of a CSV file: field separator and escape sequences.

The fundamental idea of a CSV file is described by its name: comma-separated values. This means each line contains several values, several fields. And they are separated from each other by commas.

Let’s take a look at an example:

Donald,Duck,Gänseweg 7,25337,Entenhausen
Dagobert,Duck,Am Geldspeicher 1,25336,Entenhausen

The file consists of individual lines, and each line contains a record. The individual fields of the record are separated by a comma. Unlike an XML file, there is no indication of what kind of data is in each field. Sometimes, you may find a header row in the CSV file, such as:

Vorname,Nachname,Strasse,PLZ,Ort

Now, it may happen that a comma appears within an individual field. In such cases, the field is additionally enclosed in quotation marks. Let’s play with one of the addresses as an example:

Vorname,Nachname,Strasse,PLZ,Ort
Donald,Duck,"Gänseweg 7, Hinterhaus",25337,Entenhausen
Dagobert,Duck,Am Geldspeicher 1,25336,Entenhausen

And because it is so convenient to mark this, CSV files are often generated in a way that all fields use these quotation marks.

CSV in ABAP: why the naive approach fails

Suppose we already have the line stored in a field named lf_line, and we want to split it into lt_fields:

DATA: lf_line type string,
      lt_fields type standard table of string.


One might now come up with the idea of simply splitting the line using SPLIT and then removing the quotation marks like this:

SPLIT lf_line AT ',' INTO TABLE lt_fields.
REPLACE ALL OCCURRENCES OF '"' IN TABLE lt_fields WITH space.

However, this approach only works if the data fields themselves do not contain any commas or quotation marks. If they do, all subsequent data fields will be shifted. In the best case, your program may terminate due to the subsequent processing failures. Perhaps you unintentionally write a location name into a numeric field, causing a hard program termination.

In the worst case, you may not even notice the issue, and you end up writing the street name into the postal code and the postal code into the city field. It could take a very long time before such an error is noticed by someone.

Therefore, it is crucial to handle escape sequences correctly.

Properly handling escape sequences can be complex, but fortunately, you don’t have to write it from scratch because it’s already available in SAP. The function module responsible for this is called RSDS_CONVERT_CSV, and it takes care of the work for you. You can take a look at the source code of RSDS_CONVERT_CSV if you’re interested. Writing and debugging such functionality yourself would take a considerable amount of time.

  CALL FUNCTION 'RSDS_CONVERT_CSV'
    EXPORTING
      i_data_sep       = ','
      i_esc_char       = '"'
      i_record         = lf_line
      i_field_count    = 9999
    IMPORTING
      e_t_data         = lt_fields
    EXCEPTIONS
      escape_no_close  = 1
      escape_improper  = 2
      conversion_error = 3
      OTHERS           = 4.
  IF sy-subrc <> 0.
    MESSAGE 'Fehler beim Importieren der Datei' TYPE 'E'.
  ENDIF.

Please note: You can define the delimiter yourself in this case. Usually, it is a comma. However, sometimes other delimiters are used, such as a semicolon.

As an escape character, I have only seen the double quotation mark so far, but in principle, it could be any character.

So, with this function module, we already have the core of the program. Now, we just need to build the framework around it: file selection, file opening, and looping through the lines.

How to select a file interactively

The input file is a parameter of the report. To provide a file selection dialog here, we register an input help:

AT SELECTION-SCREEN ON VALUE-REQUEST FOR p_file.

  PERFORM at_selection_screen_val_file CHANGING p_file.

At the heart of this function, this is how to generate the popup file selector:

CALL METHOD cl_gui_frontend_services=>file_open_dialog
    EXPORTING
*     window_title            =
      default_extension       = 'CSV'
      file_filter             = '*.CSV'
*     with_encoding           =
    CHANGING
      file_table              = lt_filetable
      rc                      = lf_return
      user_action             = lf_action
*     file_encoding           =
    EXCEPTIONS
      file_open_dialog_failed = 1
      cntl_error              = 2
      error_no_gui            = 3
      not_supported_by_gui    = 4
      OTHERS                  = 5.
  

Afterward, it is checked whether the dialog was exited with OK or canceled.

Check whether the file exists at all

It is also essential to check before opening whether the file even exists. If not, the user can be informed immediately.

FORM check_file_exists USING uf_filename TYPE string
                       CHANGING cf_ok TYPE abap_bool.

  DATA:
    lf_result TYPE c.

  " Check if file exists
  CALL METHOD cl_gui_frontend_services=>file_exist
    EXPORTING
      file                 = uf_filename
    RECEIVING
      result               = lf_result
    EXCEPTIONS
      cntl_error           = 1
      error_no_gui         = 2
      wrong_parameter      = 3
      not_supported_by_gui = 4
      OTHERS               = 5.

  IF sy-subrc NE 0 OR lf_result NE abap_true.
    cf_ok = abap_false.
  ELSE.
    cf_ok = abap_true.
  ENDIF.

ENDFORM.                    "at_selection_screen_pa_filei

How to read the file

For reading the file, we again use an SAP standard function:

FORM read_file USING if_filename TYPE string.
  DATA: lt_lines   TYPE tt_string,
        lf_line    TYPE string,
        lf_linenum TYPE sy-index,
        lf_colnum  TYPE sy-index,
        lt_columns TYPE tt_string,
        lf_column  TYPE string.


  CALL METHOD cl_gui_frontend_services=>gui_upload
    EXPORTING
      filename = if_filename
    CHANGING
      data_tab = lt_lines.

  lf_linenum = 0.
  LOOP AT lt_lines INTO lf_line.
    lf_linenum = sy-tabix.
    CALL FUNCTION 'RSDS_CONVERT_CSV'
      EXPORTING
        i_data_sep       = ','
        i_esc_char       = '"'
        i_record         = lf_line
        i_field_count    = 9999
      IMPORTING
        e_t_data         = lt_columns
      EXCEPTIONS
        escape_no_close  = 1
        escape_improper  = 2
        conversion_error = 3
        OTHERS           = 4.

    LOOP AT lt_columns INTO lf_column.
      lf_colnum = sy-tabix.
      WRITE: / 'row:', lf_linenum, 'col:', lf_colnum, 'content:', lf_column.
    ENDLOOP.

  ENDLOOP.


Here, the file is simply read into a local string table in memory (not in the database).

Next, a loop is used, and each of the lines is broken down.

At this point in your program, you would need to implement logic for what you want to do with the data. Perhaps there is a data type consisting of individual fields, and you assign the values and collect them in a table lt_address of type ts_address, or something similar.

Summary


For splitting a CSV string, there is a standard function module called RSDS_CONVERT_CSV.

There are also standard functions available for the surrounding tasks, such as file selection, checking for file existence, and reading the file.

More ABAP tips can be found here.

Recent Posts