Dataset importer

Deprecated IMPORTER
Deprecated since version 0.4.4
Dataset is deprecated. Please use Table instead.
Typing name : TASK.gws_core.DatasetImporter
Brick : gws_core v
Parent :

Import file to Dataset

Generic task that takes a file as input and returns a resource.

Override the import_from_path method to import the file to the destination resource

Supported extensions :  xlsx, xls, csv, tsv, tab, txt

Input

File

Output

Dataset
Data table for statistical and machine learning analysis

Configuration

file_format

Optional

File format

Type : string
Allowed values : xlsx, xls, csv, tsv, tab, txt
Default value : csv

delimiter

Optional

Delimiter character. Only for parsing CSV files

Type : string
Allowed values : auto  tab  space  ,  ;
Default value : auto
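
This page does not say how auto chooses a delimiter. As a rough illustration only, the sketch below maps the allowed values to actual characters and, for auto, picks whichever candidate occurs most often in the first line. The name resolve_delimiter and the counting heuristic are assumptions, not the importer's implementation.

```python
# Hypothetical helper: maps the documented `delimiter` values to characters.
NAMED_DELIMITERS = {"tab": "\t", "space": " ", ",": ",", ";": ";"}


def resolve_delimiter(option: str, first_line: str) -> str:
    """Return the delimiter character for a `delimiter` option value."""
    if option != "auto":
        return NAMED_DELIMITERS[option]
    # Assumed heuristic for "auto": the candidate seen most often wins.
    # On a tie, the earliest entry in NAMED_DELIMITERS is returned.
    return max(NAMED_DELIMITERS.values(), key=first_line.count)
```

For example, resolve_delimiter("auto", "a;b;c") selects the semicolon because it is the only candidate present in the line.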

header

Optional

Row to use as the column names. By default the first row is used (i.e. header=0). Set header=-1 to not read column names.

Type : int
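
The effect of header=0 versus header=-1 can be sketched with the standard csv module. The function read_with_header is hypothetical, and two details are assumptions: rows above the header row are discarded, and columns get positional names when no header is read.

```python
import csv
import io


def read_with_header(text: str, header: int = 0):
    """Split parsed CSV rows into (column_names, data_rows)."""
    rows = list(csv.reader(io.StringIO(text)))
    if header == -1:
        # No header row: every row is data and columns get positional names.
        width = len(rows[0]) if rows else 0
        return [str(i) for i in range(width)], rows
    # Assumption: rows above the header row are discarded.
    return rows[header], rows[header + 1:]
```

With header=0 the first row becomes the column names; with header=-1 that same row is kept as data.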

metadata_columns

Optional

Columns whose values are used to tag the rows of the dataset and, optionally, as targets.

Type : List
Maximum occurrences number : -1

column

Optional

Column to use to tag rows using metadata.

Type : string

keep_in_table

Optional

Set True to keep metadata in table; False otherwise

Type : bool
Default value : true

is_target

Optional

Set True to use the column as target; False otherwise

Type : bool
Default value : true
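
To make the nested structure concrete, here is a minimal sketch of a metadata_columns value and of what its three sub-parameters could mean for a table held as a list of dictionaries. The option names and defaults come from this page; the apply_metadata function, the sample rows, and the exact tagging behavior are illustrative assumptions.

```python
def apply_metadata(rows, metadata_columns):
    """Return (table_rows, row_tags, target_names) for the given config."""
    row_tags = [{} for _ in rows]
    targets = []
    dropped = set()
    for spec in metadata_columns:
        name = spec["column"]
        # Tag every row with its value in the metadata column.
        for tags, row in zip(row_tags, rows):
            tags[name] = row[name]
        # Both flags default to true, as documented above.
        if spec.get("is_target", True):
            targets.append(name)
        if not spec.get("keep_in_table", True):
            dropped.add(name)
    table = [{k: v for k, v in row.items() if k not in dropped} for row in rows]
    return table, row_tags, targets


# Made-up sample data, for illustration only.
rows = [
    {"sample": "s1", "group": "control", "value": "1.2"},
    {"sample": "s2", "group": "treated", "value": "3.4"},
]
config = [{"column": "group", "keep_in_table": False, "is_target": True}]
table, row_tags, targets = apply_metadata(rows, config)
```

Here the group column tags each row and is listed as a target, and because keep_in_table is false it no longer appears in the table itself.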

index_column

Optional
Advanced parameter

Column to use as the row names. By default no index is used (i.e. index_column=-1).

Type : int
Default value : -1
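
As a sketch of the index_column semantics, the hypothetical function below separates row names from data. It assumes the index column is removed from the data once it has been used as row names, which this page does not state.

```python
def split_index(rows, index_column=-1):
    """Return (row_names, data_rows); row_names is None when no index is used."""
    if index_column == -1:
        return None, rows
    names = [row[index_column] for row in rows]
    # Assumption: the index column is removed from the remaining data.
    data = [row[:index_column] + row[index_column + 1:] for row in rows]
    return names, data
```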

decimal

Optional
Advanced parameter

Character to recognize as decimal point (e.g. use ‘,’ for European/French data).

Type : string
Default value : .
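
The role of decimal can be illustrated with a small conversion helper. parse_number is a hypothetical name, and the sketch only shows the idea of normalizing the decimal character before conversion; it is not the importer's code.

```python
def parse_number(text: str, decimal: str = ".") -> float:
    """Convert a numeric string that uses `decimal` as its decimal point."""
    if decimal != ".":
        text = text.replace(decimal, ".")
    return float(text)
```

When decimal is a comma, the delimiter has to be a different character, which is why such files commonly use a semicolon between fields.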

nrows

Optional
Advanced parameter

Number of rows to import. Useful for reading only part of the data.

Type : int

comment

Optional
Advanced parameter

Character used to comment lines.

Type : string
Default value : #
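
The comment and nrows parameters both limit which lines are read. The sketch below is a minimal illustration under two assumptions that this page does not confirm: only lines that start with the comment character are skipped, and nrows counts the lines that remain afterwards. The name select_lines is hypothetical.

```python
def select_lines(lines, comment="#", nrows=None):
    """Drop comment lines, then keep at most `nrows` of the remaining lines."""
    kept = [line for line in lines if not line.startswith(comment)]
    return kept if nrows is None else kept[:nrows]
```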