Example of CSV Data Load

This sample load exercise shows you how to: 

  • Load a CSV file into a raw data schema

  • Use global variables

  • Create a dimension table

Data File

Information for the data file used for this exercise follows.

Data file name: example_dimension-20200727.csv
Vendor name: TESTVENDOR
Feedname: DIMTABLE
Model Name: TESTVENDOR_DIMTABLE
Format: CSV with comma delimiter
Global columns: yes
Building Dimension table: yes

REPORT_DATE,ENTITY_ID,USER_DATE4,USER_CHAR4,USER_DATE5,USER_NUM4,USER_NUM5,USER_CHAR5,USER_CHAR6,USER_NUM6,USER_DATE6 20170815,ENTTST01,20170101,TEST1,20170102,11.1,11.2,TEST2,TEST3,11.3,20170103 20170816,ENTTST02,20170104,TEST4,20170105,12.1,12.2,TEST5,TEST6,12.3,20170106 20170817,ENTTST03,20170107,TEST7,20170108,13.1,13.2,TEST8,TEST9,13.3,20170109 20170818,ENTTST04,20170110,TEST10,20170111,14.1,14.2,TEST11,TEST12,14.3,20170112 20170819,ENTTST05,20170113,TEST13,20170114,15.1,15.2,TEST14,TEST15,15.3,20170115

Steps to Load

To load CSV data:

  1. Open the Raw Service tool and click Create New.

  2. In the Create new dialog box:
    - Enter the Vendor Name and Feed Name values,
    - Ensure the Format is set to CSV.
    - Click the Create button. 

    Create new dialog box

    The system creates a new vendor/feed combination.

  3. In the edit dialog box that appears, click the Load From File button and upload the data file named example_dimension-20200727.csv.

  4. In the Sample Data tab, enter the following values:
    - Select the Header lines check box and set the related box to 1
    - Select the Header at line number check box and set the related box to 1 
    - Set the Columns Separator list to ,

    Raw Service tool - Data Settings tab
  5. Click the File Name Parameters tab, and provide the file name, example_dimension-20200727, and - (dash) as delimiter.
    This splits the file name at the dash position, and creates a global variable for each of the resulting parts.

  6. Name the variables.
    This allows you to use the date coming from the file name in your loads.

    Raw Service tool - File Parameters tab
  7. Click the Structure Description tab.

  8. Do the following:
    - In the Key column, select the REPORT_DATE and ENTITY_ID field check boxes as key fields.
    - In the Selected column, select FileName_Param_2 so it shows a check mark to the list of selected fields. FileName_Param_2 contains the date portion of file name. 

  9. Click the Dimensions tab, and provide a name for the dimensions table.
    You may also want to remove the GUID part from the Structure name, to make it simpler.

  10. On the Raw Service tab, click Publish.
    The Publish Feed dialog box lists all the underlying files which will be uploaded to the region.

  11. Accept the defaults and proceed by clicking the Publish button in the Publish Feed dialog box.

    Publish Feed box

    The solution is now ready for a test run. 

  12. On the Raw Service tab, click Run and select the test data file.

    Raw Service tool Run option and Run RDS Process box
  13. In the Run RDS Process dialog box, click the Run button.

Review the Results

For information on how to review the results of the load, see Monitor Ingested Data in Raw Service Loads