Dataiku Key Capabilities
Data Preparation
The Dataiku visual flow lets coders and non-coders alike build data pipelines from datasets, recipes that join and transform those datasets, and predictive models.
The visual flow also supports code and reusable plugin elements for customization and advanced functions.
Visualization
Dataiku saves time with quick visual analysis of columns, including the distribution of values, top values, outliers, invalids, and overall statistics.
For categorical data, the visual analysis shows the distribution by value, including the count and percentage of rows for each value.
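As a rough illustration of the per-value count and percentage described above, here is a minimal sketch in plain Python. It is not Dataiku code; the function name `categorical_summary` and the sample column are hypothetical.

```python
from collections import Counter

def categorical_summary(values):
    """Return (value, count, percent) rows sorted by descending count."""
    counts = Counter(values)
    total = len(values)
    return [
        (value, count, round(100.0 * count / total, 1))
        for value, count in counts.most_common()
    ]

# Hypothetical column of categorical values.
colors = ["red", "blue", "red", "green", "red", "blue"]
summary = categorical_summary(colors)
for value, count, percent in summary:
    print(f"{value}: {count} ({percent}%)")
```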
Machine Learning
To aid in the feature engineering process, Dataiku AutoML automatically fills missing values and converts non-numeric data into numerical values using well-established encoding techniques.
Users can also create new features using formulas, code, or built-in visual recipes to provide additional signals to improve model accuracy. Once created, Dataiku stores feature engineering steps in recipes for reuse in scoring and model retraining.
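To make the two preprocessing ideas concrete, the sketch below fills missing numeric values and turns a categorical column into numeric indicator columns. The text does not say which techniques Dataiku AutoML applies, so mean imputation and one-hot encoding are assumptions chosen because they are common; the function names and sample data are hypothetical.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def one_hot(values):
    """Encode categories as 0/1 indicator columns, one per distinct value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Hypothetical columns: one numeric with a gap, one categorical.
ages = [30, None, 50]
plans = ["basic", "pro", "basic"]

filled_ages = impute_mean(ages)
categories, encoded_plans = one_hot(plans)
print(filled_ages)
print(categories, encoded_plans)
```

Keeping each step as a small, named function mirrors the idea of storing feature engineering steps so that the same transformation can be reapplied at scoring and retraining time.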
DataOps
Dataiku projects are the central place for all work and collaboration for users. Each Dataiku project has a visual flow, including the pipeline of datasets and recipes associated with the project.
Users can view the project and associated assets (like dashboards), check the project’s overall status, and view recent activity.
MLOps
The Dataiku unified deployer manages the movement of project files between Dataiku design nodes and production nodes for batch and real-time scoring. Project bundles package everything a project needs from the design environment to run on the production environment.
With Dataiku, data scientists can see all the deployed bundles, and data engineers or IT operations can quickly see when a new bundle requires testing and roll-out.
Analytic Apps
Dataiku makes it easy to create project dashboards and share them with business users. Dashboard updates can be scheduled or triggered, so the latest information is always available.
With dashboards as part of a Dataiku project, business users and project stakeholders can easily see the outputs of AI projects and track KPIs and value.
Collaboration
Real advanced analytics projects require a series of steps that transform data from one state to the next, resulting in new datasets, features, metrics, charts, dashboards, predictive models, and applications.
The Dataiku visual flow is the canvas where teams collaborate on data projects. With the visual flow, everyone on the team can use common objects and visual language to describe the step-by-step approach and document the entire data process for future users.
Governance
Dataiku permissions control who on the team can access, read, and change a project. Permissions also cover creating projects, executing code, executing applications, read-only access to content, and more. Users can belong to more than one group and hold different permissions across projects, and organizations can also set global permissions.
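One way to picture group-based, per-project permissions is the sketch below. It is a hypothetical model, not Dataiku's implementation: it assumes a user's effective permissions on a project are the union of what each of their groups is granted, and the group names, project names, and `GROUP_GRANTS` table are all invented for illustration.

```python
# Hypothetical grants: group -> project -> set of permissions.
GROUP_GRANTS = {
    "analysts": {"churn": {"read"}},
    "engineers": {"churn": {"read", "write"}, "fraud": {"read"}},
}

def effective_permissions(user_groups, project):
    """Union the permissions each of the user's groups holds on a project."""
    granted = set()
    for group in user_groups:
        granted |= GROUP_GRANTS.get(group, {}).get(project, set())
    return granted

# A user in two groups gets the combined grants on "churn".
churn_perms = effective_permissions(["analysts", "engineers"], "churn")
# A user whose only group has no grant on "fraud" gets nothing there.
fraud_perms = effective_permissions(["analysts"], "fraud")
print(sorted(churn_perms), sorted(fraud_perms))
```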
Explainability
Dataiku provides critical capabilities for explainable AI, including reports on feature importance, partial dependence plots, subpopulation analysis, and individual prediction explanations.
Together, these techniques can help explain how a model makes decisions and enable data scientists and key stakeholders to understand the factors influencing model predictions.
Architecture
Dataiku can run on-premises or in the cloud, with supported instances on Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure, and it integrates with the storage and computational layers of each cloud.