Python API Client Methods
Python wrappers are implemented by the ApiClient class. To obtain an instance, call api_client().
For example:
import datadotworld as dw
client = dw.api_client()
Datasets
Create a new dataset
create_dataset
(owner_id,**kwargs)
Parameters
- owner_id (str) – Username of the owner of the new dataset
- title (str) – Dataset title (will be used to generate dataset id on creation)
- description (str, optional) – Dataset description
- summary (str, optional) – Dataset summary markdown
- tags (list, optional) – Dataset tags
- license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Dataset license
- visibility ({'OPEN', 'PRIVATE'}) – Dataset visibility
- files (dict, optional) – File names mapped to dicts with the source url and, optionally, description and labels as properties
Returns
- Newly created dataset key
Return type
- str
Raises
- RestApiException – If a server error occurs
For example:
import datadotworld as dw
api_client = dw.api_client()
url = 'http://www.acme.inc/example.csv'
api_client.create_dataset('username',
title='Test dataset',
visibility='PRIVATE',
license='Public Domain',
files={'dataset.csv': {'url': url}})
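The license and visibility values listed above can be checked locally before the request is sent. The following is a minimal sketch, not part of datadotworld: check_dataset_kwargs is a hypothetical helper, and treating title as the only required keyword is an assumption based on the parameter list above.

```python
# Hypothetical helper (not part of datadotworld): checks keyword arguments
# against the license and visibility values listed in this section.
LICENSES = {'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA',
            'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}
VISIBILITIES = {'OPEN', 'PRIVATE'}


def check_dataset_kwargs(**kwargs):
    """Return a list of problems found in kwargs (empty if none)."""
    problems = []
    # Assumption: title is the only keyword treated as required here.
    if not kwargs.get('title'):
        problems.append('title is required')
    license_ = kwargs.get('license')
    if license_ is not None and license_ not in LICENSES:
        problems.append('unknown license: %r' % (license_,))
    visibility = kwargs.get('visibility')
    if visibility is not None and visibility not in VISIBILITIES:
        problems.append('unknown visibility: %r' % (visibility,))
    return problems


print(check_dataset_kwargs(title='Test dataset', visibility='PRIVATE',
                           license='Public Domain'))
print(check_dataset_kwargs(title='Test dataset', visibility='private'))
```

An empty list means the keywords match the documented values; the second call reports the lowercase visibility.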
Update an existing dataset
update_dataset
(dataset_key,**kwargs)
Parameters
- description (str, optional) – Dataset description
- summary (str, optional) – Dataset summary markdown
- tags (list, optional) – Dataset tags
- license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Dataset license
- visibility ({'OPEN', 'PRIVATE'}, optional) – Dataset visibility
- files (dict, optional) – File names and source URLs to add or update
- dataset_key (str) – Dataset identifier, in the form of owner/id
Raises
- RestApiException – If a server error occurs
For example:
import datadotworld as dw
api_client = dw.api_client()
api_client.update_dataset('username/test-dataset', tags=['demo', 'datadotworld'])
Replace an existing dataset
replace_dataset
(dataset_key,**kwargs)
This method will completely overwrite an existing dataset.
Parameters
- description (str, optional) – Dataset description
- summary (str, optional) – Dataset summary markdown
- tags (list, optional) – Dataset tags
- license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Dataset license
- visibility ({'OPEN', 'PRIVATE'}) – Dataset visibility
- files (dict, optional) – File names and source URLs to add or update
- dataset_key (str) – Dataset identifier, in the form of owner/id
Raises
- RestApiException – If a server error occurs
For example:
import datadotworld as dw
api_client = dw.api_client()
api_client.replace_dataset('username/test-dataset',
visibility='PRIVATE',
license='Public Domain',
description='A better description')
Get an existing dataset definition
get_dataset
(dataset_key)
This method retrieves metadata about an existing dataset.
Parameters
- dataset_key (str) – Dataset identifier, in the form of owner/id
Returns
- Dataset definition, with all attributes
Return type
- dict
Raises
- RestApiException – If a server error occurs
For example:
import datadotworld as dw
api_client = dw.api_client()
intro_dataset = api_client.get_dataset('jonloyens/an-intro-to-dataworld-dataset')
print(intro_dataset['title'])
Delete a dataset and all associated data
delete_dataset
(dataset_key)
Parameters
- dataset_key (str) – Dataset identifier, in the form of owner/id
Raises
- RestApiException – If a server error occurs
For example:
import datadotworld as dw
api_client = dw.api_client()
api_client.delete_dataset('username/dataset')
Fetch authenticated user owned datasets
fetch_datasets
(**kwargs)
Parameters
- limit (str, optional) – Maximum number of items to include in a page of results
- next (str, optional) – Token from previous result page (to be used when requesting a subsequent page)
- sort (str, optional) – Property name to sort by
Returns
- Dataset definition, with all attributes
Return type
- dict
Raises
- RestApiException – If a server error occurs
Examples:
import datadotworld as dw
api_client = dw.api_client()
user_owned_dataset = api_client.fetch_datasets()
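The limit and next parameters suggest token-based paging. The sketch below shows one way to walk every page. The response keys 'records' and 'nextPageToken' are assumptions, so inspect a real response before relying on them; iter_records and fake_fetch are hypothetical names, and fake_fetch stands in for the client method so the sketch runs offline.

```python
# Sketch of paging with the `next` token. The response keys 'records' and
# 'nextPageToken' are assumptions, not confirmed by this page.
def iter_records(fetch_page, limit=None):
    """Yield records from every page returned by fetch_page(**kwargs)."""
    kwargs = {}
    if limit is not None:
        kwargs['limit'] = limit
    while True:
        page = fetch_page(**kwargs)
        for record in page.get('records', []):
            yield record
        token = page.get('nextPageToken')
        if not token:
            break
        # Pass the token back to request the subsequent page.
        kwargs['next'] = token


# Stand-in for api_client.fetch_datasets, so the sketch runs without network access.
def fake_fetch(limit=None, next=None):
    pages = {
        None: {'records': [{'id': 'a'}, {'id': 'b'}], 'nextPageToken': 't1'},
        't1': {'records': [{'id': 'c'}]},
    }
    return pages[next]


print([record['id'] for record in iter_records(fake_fetch, limit='2')])
```

With a real client, the same generator could be called as iter_records(api_client.fetch_datasets), provided the response layout matches the assumption.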
Download a dataset
download_dataset
(dataset_key)
Return a .zip containing all files within the dataset as uploaded.
Parameters
- dataset_key (str) – Dataset identifier, in the form of owner/id
Returns
- .zip file containing the files within the dataset
Return type
- file object
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.download_dataset('username/test-dataset')
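Since the result is described as a file object holding a .zip, its contents can be listed with the standard zipfile module. This is a sketch under the assumption that the returned object is a seekable binary file object; list_zip_members is a hypothetical helper, and the in-memory archive is a stand-in for a real download.

```python
import io
import zipfile


def list_zip_members(fileobj):
    """Return the names of the files inside a zip given as a binary file object."""
    with zipfile.ZipFile(fileobj) as archive:
        return archive.namelist()


# Build a small in-memory zip as a stand-in for the download_dataset result.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('example.csv', 'a,b\n1,2\n')
buffer.seek(0)

print(list_zip_members(buffer))
```

If the real object turns out not to be seekable, reading its bytes into io.BytesIO first would be one workaround.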
Download datapackage
download_datapackage
(dataset_key, dest_dir)
Download and unzip a dataset’s datapackage
Parameters
- dataset_key (str) – Dataset identifier, in the form of owner/id
- dest_dir (str or path) – Directory under which datapackage should be saved
Returns
- Location of the datapackage descriptor (datapackage.json) in the local filesystem
Return type
- path
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
datapackage_descriptor = api_client.download_datapackage(
'jonloyens/an-intro-to-dataworld-dataset',
'/tmp/test'
)
datapackage_descriptor  # '/tmp/test/datapackage.json'
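Once the descriptor path is known, the file can be read with the standard json module. The sketch below assumes the descriptor follows the usual Data Package layout with a top-level resources list; resource_names is a hypothetical helper, and the sample descriptor is made up for illustration.

```python
import json
import os
import tempfile


def resource_names(descriptor_path):
    """Return the names of the resources listed in a datapackage.json file."""
    with open(descriptor_path, encoding='utf-8') as handle:
        descriptor = json.load(handle)
    return [resource.get('name') for resource in descriptor.get('resources', [])]


# Write a tiny descriptor as a stand-in for the downloaded one.
with tempfile.TemporaryDirectory() as dest_dir:
    path = os.path.join(dest_dir, 'datapackage.json')
    sample = {'name': 'demo',
              'resources': [{'name': 'example', 'path': 'data/example.csv'}]}
    with open(path, 'w', encoding='utf-8') as handle:
        json.dump(sample, handle)
    print(resource_names(path))
```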
SPARQL
sparql
(dataset_key, query, desired_mimetype='application/sparql-results+json', **kwargs)
Executes SPARQL queries against a dataset via POST
Parameters
- dataset_key (str) – Dataset identifier, in the form of owner/id
- query (str) – SPARQL query
Returns
- file object that can be used in file parsers and data handling modules.
Return type
- file object
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.sparql('username/test-dataset', 'query')
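With the default application/sparql-results+json mimetype, the returned file object can be parsed with the json module. The sketch below flattens the standard SPARQL JSON results layout (head.vars and results.bindings) into plain rows; sparql_rows is a hypothetical helper, and the sample payload is invented so the code runs without a server.

```python
import io
import json


def sparql_rows(fileobj):
    """Convert SPARQL JSON results into a list of {variable: value} dicts."""
    payload = json.load(fileobj)
    variables = payload['head']['vars']
    rows = []
    for binding in payload['results']['bindings']:
        # Unbound variables are absent from a binding; report them as None.
        rows.append({var: binding[var]['value'] if var in binding else None
                     for var in variables})
    return rows


# Invented sample standing in for the file object returned by the client.
sample = io.StringIO(json.dumps({
    'head': {'vars': ['name', 'age']},
    'results': {'bindings': [
        {'name': {'type': 'literal', 'value': 'Ada'},
         'age': {'type': 'literal', 'value': '36'}},
        {'name': {'type': 'literal', 'value': 'Bob'}},
    ]},
}))
print(sparql_rows(sample))
```

Values stay as strings here; converting them by datatype is left out to keep the sketch short.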
SQL
sql
(dataset_key, query, desired_mimetype='application/json', **kwargs)
Executes SQL queries against a dataset via POST
Parameters
- dataset_key (str) – Dataset identifier, in the form of owner/id
- query (str) – SQL query
- include_table_schema (bool) – Flag indicating whether to include the table schema in the response
Returns
- file object that can be used in file parsers and data handling modules.
Return type
- file-like object
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.sql('username/test-dataset','query')
Manage files
Add files from URL
The add_files_via_url() function can be used to add files to a dataset from a URL.
This can be done by specifying files as a dictionary where the keys are the desired file names and each value is an object containing url, description and labels.
add_files_via_url
(dataset_key, files={})
Parameters
- dataset_key (str) – Dataset identifier, in the form of owner/id
- files (dict) – File names mapped to dicts containing the source url and, optionally, description and labels, for the files to add or update (Default value = {})
Raises
- RestApiException – If a server error occurs
For example:
import datadotworld as dw
client = dw.api_client()
client.add_files_via_url('username/test-dataset', files={'sample.xls': {'url': 'http://www.sample.com/sample.xls', 'description': 'sample doc', 'labels': ['raw data']}})
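When many files are involved, the files dictionary can be assembled programmatically. The sketch below is one possible way to do so; build_files is a hypothetical helper, not part of datadotworld, and it follows the structure described above, leaving out description and labels when they are not given.

```python
def build_files(entries):
    """Build the `files` dict from (name, url, description, labels) tuples.

    description and labels are optional and left out when empty.
    """
    files = {}
    for name, url, description, labels in entries:
        properties = {'url': url}
        if description:
            properties['description'] = description
        if labels:
            properties['labels'] = list(labels)
        files[name] = properties
    return files


files = build_files([
    ('sample.xls', 'http://www.sample.com/sample.xls', 'sample doc', ['raw data']),
    ('plain.csv', 'http://www.sample.com/plain.csv', None, None),
])
print(files)
```

The resulting dictionary has the same shape as the files argument in the example above.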
Upload dataset files
upload_files
(dataset_key, files, files_metadata={},**kwargs)
Upload one or more dataset files
Parameters
- dataset_key (str) – Dataset identifier, in the form of owner/id
- files (list of str) – The list of names/paths for files stored in the local filesystem
- expand_archives (bool, optional) – Whether archive files should be expanded upon upload
- files_metadata (dict, optional) – File names mapped to dicts containing the description, labels and source URLs to add or update
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.upload_files('username/test-dataset', ['/my/local/example.csv'])
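Because files refers to paths in the local filesystem, it can help to confirm the paths exist before uploading. This is a small sketch using only the standard library; split_existing is a hypothetical helper, and the temporary directory stands in for real data files.

```python
import os
import tempfile


def split_existing(paths):
    """Return (found, missing) lists according to the local filesystem."""
    found = [path for path in paths if os.path.isfile(path)]
    missing = [path for path in paths if not os.path.isfile(path)]
    return found, missing


with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, 'example.csv')
    with open(present, 'w') as handle:
        handle.write('a,b\n1,2\n')
    absent = os.path.join(tmp, 'missing.csv')
    found, missing = split_existing([present, absent])
    print(len(found), len(missing))
```

Only the found list would then be passed as the files argument.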
Delete dataset file(s)
delete_files
(dataset_key, names)
Parameters
- dataset_key (str) – Dataset identifier, in the form of owner/id
- names (list of str) – The list of names for files to be deleted
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.delete_files('username/test-dataset', ['example.csv'])
Files synchronization
sync_files
(dataset_key)
Trigger synchronization process to update all dataset files linked to source URLs.
Parameters
- dataset_key (str) – Dataset identifier, in the form of owner/id
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.sync_files('username/test-dataset')
Download a file
download_file
(dataset_key, file)
Return a file within the dataset as uploaded.
Parameters
- dataset_key (str) – Dataset identifier, in the form of owner/id
- file (str) – File path to be returned
Returns
- The file, as uploaded
Return type
- file object
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.download_file('username/test-dataset', '/my/local/example.csv')
User data
Get user data
get_user_data
()
Retrieve data for authenticated user
Returns
- User data, with all attributes
Return type
- dict
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
user_data = api_client.get_user_data()
Fetch contributing datasets
fetch_contributing_datasets
(**kwargs)
Fetch datasets that the authenticated user has access to
Parameters
- limit (str, optional) – Maximum number of items to include in a page of results
- next (str, optional) – Token from previous result page (to be used when requesting a subsequent page)
- sort (str, optional) – Property name to sort by
Returns
- Authenticated user dataset
Return type
- dict
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
user_dataset = api_client.fetch_contributing_datasets()
Fetch liked datasets
fetch_liked_datasets
(**kwargs)
Fetch datasets that authenticated user likes
Parameters
- limit (str, optional) – Maximum number of items to include in a page of results
- next (str, optional) – Token from previous result page (to be used when requesting a subsequent page)
- sort (str, optional) – Property name to sort by
Returns
- Dataset definition, with all attributes
Return type
- dict
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
user_liked_dataset = api_client.fetch_liked_datasets()
Projects
Create project
create_project
(owner_id, **kwargs)
Create a new project
Parameters
- owner_id (str) – Username of the creator of a project.
- title (str) – Project title (will be used to generate project id on creation)
- objective (str, optional) – Short project objective.
- summary (str, optional) – Long-form project summary.
- tags (list, optional) – Project tags (letters, numbers and spaces)
- license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Project license
- visibility ({'OPEN', 'PRIVATE'}) – Project visibility
- files (dict, optional) – File names mapped to dicts with the source url and, optionally, description and labels as properties
- linked_datasets (list of object, optional) – Initial set of linked datasets.
Returns
- Newly created project key
Return type
- str
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.create_project(
'username',
title='project testing',
visibility='PRIVATE',
linked_datasets=[{'owner': 'someuser', 'id': 'somedataset'}]
)
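The linked_datasets entries use separate owner and id fields, while most other methods take a single owner/id key. The sketch below converts between the two; to_linked_datasets is a hypothetical helper, not part of datadotworld.

```python
def to_linked_datasets(dataset_keys):
    """Turn 'owner/id' keys into {'owner': ..., 'id': ...} dicts."""
    linked = []
    for key in dataset_keys:
        owner, sep, dataset_id = key.partition('/')
        # Reject keys that are not exactly two non-empty parts.
        if not (owner and sep and dataset_id) or '/' in dataset_id:
            raise ValueError("expected 'owner/id', got %r" % (key,))
        linked.append({'owner': owner, 'id': dataset_id})
    return linked


print(to_linked_datasets(['someuser/somedataset']))
```

The returned list could then be passed as the linked_datasets keyword argument.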
Update project
update_project
(project_key, **kwargs)
Update an existing project
Parameters
- project_key (str) – Project identifier, in the form of owner/id
- title (str) – Project title
- objective (str, optional) – Short project objective.
- summary (str, optional) – Long-form project summary.
- tags (list, optional) – Project tags (letters, numbers and spaces)
- license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Project license
- visibility ({'OPEN', 'PRIVATE'}) – Project visibility
- files (dict, optional) – File names mapped to dicts with the source url and, optionally, description and labels as properties
- linked_datasets (list of object, optional) – Initial set of linked datasets.
Returns
- message object
Return type
- object
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.update_project(
'username/test-project',
tags=['demo', 'datadotworld']
)
Replace project
replace_project
(project_key, **kwargs)
Replace an existing Project
Create a project with a given id or completely rewrite the project, including any previously added files or linked datasets, if one already exists with the given id.
Parameters
- project_key (str) – Project identifier, in the form of owner/id
- title (str) – Project title
- objective (str, optional) – Short project objective.
- summary (str, optional) – Long-form project summary.
- tags (list, optional) – Project tags (letters, numbers and spaces)
- license ({'Public Domain', 'PDDL', 'CC-0', 'CC-BY', 'ODC-BY', 'CC-BY-SA', 'ODC-ODbL', 'CC BY-NC', 'CC BY-NC-SA', 'Other'}) – Project license
- visibility ({'OPEN', 'PRIVATE'}) – Project visibility
- files (dict, optional) – File names mapped to dicts with the source url and, optionally, description and labels as properties
- linked_datasets (list of object, optional) – Initial set of linked datasets.
Returns
- project object
Return type
- object
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.replace_project(
'username/test-project',
visibility='PRIVATE',
objective='A better objective',
title='Replace project'
)
Get project
get_project
(project_key)
Retrieve an existing project
This method retrieves metadata about an existing project
Parameters
- project_key (str) – Project identifier, in the form of owner/id
Returns
- Project definition, with all attributes
Return type
- dict
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
intro_project = api_client.get_project('jonloyens/an-example-project-that-shows-what-to-put-in-data-world')
Fetch contributing projects
fetch_contributing_projects
(**kwargs)
Fetch projects that the currently authenticated user has access to
Returns
- Authenticated user projects
Return type
- dict
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
user_projects = api_client.fetch_contributing_projects()
Fetch liked projects
fetch_liked_projects
(**kwargs)
Fetch projects that the currently authenticated user likes
Returns
- Authenticated user projects
Return type
- dict
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
user_liked_projects = api_client.fetch_liked_projects()
Fetch projects
fetch_projects
(**kwargs)
Fetch projects that the currently authenticated user owns
Returns
- Authenticated user projects
Return type
- dict
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
user_projects = api_client.fetch_projects()
Add linked datasets
add_linked_dataset
(project_key, dataset_key)
Link project to an existing dataset
This method links a dataset to a project
Parameters
- project_key (str) – Project identifier, in the form of owner/id
- dataset_key (str) – Dataset identifier, in the form of owner/id
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
linked_dataset = api_client.add_linked_dataset(
'username/test-project',
'username/test-dataset'
)
Remove linked dataset
remove_linked_dataset
(project_key, dataset_key)
Unlink dataset
This method unlinks a dataset from a project
Parameters
- project_key (str) – Project identifier, in the form of owner/id
- dataset_key (str) – Dataset identifier, in the form of owner/id
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.remove_linked_dataset(
'username/test-project',
'username/test-dataset'
)
Delete project
delete_project
(project_key)
Deletes a project and all associated data
Parameters
- project_key (str) – Project identifier, in the form of owner/id
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.delete_project('username/test-project')
Insights
Get insight
get_insight
(project_key, insight_id, **kwargs)
Retrieve an insight
Parameters
- project_key (str) – Project identifier, in the form of projectOwner/projectid
- insight_id (str) – Insight unique identifier.
Returns
- Insight definition, with all attributes
Return type
- object
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
insight = api_client.get_insight(
'jonloyens/an-example-project-that-shows-what-to-put-in-data-world',
'c2538b0c-c200-474c-9631-5ff4f13026eb'
)
insight['title']  # 'Coast Guard Lives Saved by Fiscal Year'
Get insights for project
get_insights_for_project
(project_key, **kwargs)
Get insights for a project.
Parameters
- project_key (str) – Project identifier, in the form of projectOwner/projectid
Returns
- Insight results
Return type
- object
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
insights = api_client.get_insights_for_project(
'jonloyens/an-example-project-that-shows-what-to-put-in-data-world'
)
Create insight
create_insight
(project_key, **kwargs)
Create a new insight
Parameters
- project_key (str) – Project identifier, in the form of projectOwner/projectid
- title (str) – Insight title
- description (str, optional) – Insight description.
- image_url (str) – If image-based, the URL of the image
- embed_url (str) – If embed-based, the embeddable URL
- source_link (str, optional) – Permalink to source code or platform this insight was generated with. Allows others to replicate the steps originally used to produce the insight.
- data_source_links (array) – One or more permalinks to the data sources used to generate this insight. Allows others to access the data originally used to produce the insight.
Returns
- Insight with message and uri object
Return type
- object
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.create_insight(
'projectOwner/projectid',
title='Test insight',
image_url='url'
)
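The image_url and embed_url descriptions suggest an insight is either image-based or embed-based. The sketch below builds the keyword arguments under the assumption that exactly one of the two should be supplied; that rule is not stated on this page, and insight_kwargs is a hypothetical helper.

```python
def insight_kwargs(title, image_url=None, embed_url=None, description=None):
    """Build keyword arguments for create_insight.

    Assumption: exactly one of image_url and embed_url should be given.
    """
    if (image_url is None) == (embed_url is None):
        raise ValueError('provide exactly one of image_url or embed_url')
    kwargs = {'title': title}
    if image_url is not None:
        kwargs['image_url'] = image_url
    else:
        kwargs['embed_url'] = embed_url
    if description is not None:
        kwargs['description'] = description
    return kwargs


print(insight_kwargs('Test insight', image_url='url'))
```

The result could be unpacked into the call, for example api_client.create_insight('projectOwner/projectid', **kwargs).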
Replace insight
replace_insight
(project_key, insight_id, **kwargs)
Replace an insight.
Parameters
- project_key (str) – Project identifier, in the form of projectOwner/projectid
- insight_id (str) – Insight unique identifier.
- title (str) – Insight title
- description (str, optional) – Insight description.
- image_url (str) – If image-based, the URL of the image
- embed_url (str) – If embed-based, the embeddable URL
- source_link (str, optional) – Permalink to source code or platform this insight was generated with. Allows others to replicate the steps originally used to produce the insight.
- data_source_links (array) – One or more permalinks to the data sources used to generate this insight. Allows others to access the data originally used to produce the insight.
Returns
- message object
Return type
- object
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.replace_insight(
'projectOwner/projectid',
'1230-9324-3424242442',
embed_url='url',
title='Test insight'
)
Update insight
update_insight
(project_key, insight_id, **kwargs)
Update an insight.
Note that only elements included in the request will be updated. All omitted elements will remain untouched.
Parameters
- project_key (str) – Project identifier, in the form of projectOwner/projectid
- insight_id (str) – Insight unique identifier.
- title (str) – Insight title
- description (str, optional) – Insight description.
- image_url (str) – If image-based, the URL of the image
- embed_url (str) – If embed-based, the embeddable URL
- source_link (str, optional) – Permalink to source code or platform this insight was generated with. Allows others to replicate the steps originally used to produce the insight.
- data_source_links (array) – One or more permalinks to the data sources used to generate this insight. Allows others to access the data originally used to produce the insight.
Returns
- message object
Return type
- object
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
api_client.update_insight(
'username/test-project',
'insightid',
title='demo datadotworld'
)
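The update note above (omitted elements remain untouched) contrasts with a replace, which rewrites the whole definition. The difference can be pictured with plain dictionaries. This is a local model for illustration only; it does not call the API, and apply_update and apply_replace are hypothetical names.

```python
# Local model only: plain dicts stand in for the stored insight.
def apply_update(stored, **changes):
    """Model of update: omitted attributes keep their stored values."""
    merged = dict(stored)
    merged.update(changes)
    return merged


def apply_replace(stored, **definition):
    """Model of replace: the new definition supersedes the stored one."""
    return dict(definition)


stored = {'title': 'Test insight', 'description': 'Old text', 'image_url': 'url'}
print(apply_update(stored, title='demo datadotworld'))
print(apply_replace(stored, title='demo datadotworld', embed_url='url'))
```

In the first result the description and image_url survive; in the second they are gone because the replacement definition did not include them.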
Delete insight
delete_insight
(project_key, insight_id)
Delete an existing insight.
Parameters
- project_key (str) – Project identifier, in the form of projectOwner/projectId
- insight_id (str) – Insight unique id
Raises
- RestApiException – If a server error occurs
Examples
import datadotworld as dw
api_client = dw.api_client()
del_insight = api_client.delete_insight(
'username/project',
'insightid'
)
You can find more about those functions using help(client)