Releases: datahub-project/datahub
v1.2.0
What's Changed
- ci: don't rerun docker workflows on labels by @hsheth2 in #13405
- test(audit-events): updates for audit event tests by @david-leifker in #13419
- feat(ingest): associate queries with operations by @hsheth2 in #13404
- Update search results page by @annadoesdesign in #13303
- chore(airflow): update dev mypy to 1.14.1 by @anshbansal in #13374
- feat(changeSyncAction): support RESTATE type syncs by @RyanHolstien in #13406
- feat(ui/lineage): Make show ghost entities toggle local storage sticky by @asikowitz in #13424
- ci: Add yaml format check by @asikowitz in #13407
- fix(graphql): remove false deprecation note by @jayacryl in #13402
- feat(ingestion): Make jsonProps of schemaMetadata less verbose by @skrydal in #13416
- feat(sdk): scaffold assertion client by @anthonyburdi in #13362
- fix(): DUE Producer Configuration & tracking message validation by @david-leifker in #13427
- fix(build): fix version in jars by @chakru-r in #13432
- fix(smoke-test): fix flakiness of audit smoke test by @RyanHolstien in #13429
- fix(ingest/snowflake): fix previously broken tests by @hsheth2 in #13428
- fix(searchBarAutocomplete): ui tweaks by @v-tarasevich-blitz-brain in #13430
- fix(docker): Fix for metadata ingestion docker build by @treff7es in #13435
- fix(ui) Add ellipses and tooltip to long names on home page header by @chriscollins3456 in #13425
- updated search menu items after search update by @annadoesdesign in #13422
- docs(release): Adding notes for v0.3.10.3 release by @jjoyce0510 in #13437
- fix(ingest/tableau): Fix infinite loop in Tableau retry by @treff7es in #13442
- fix(cli): ignore extra configs by @anshbansal in #13444
- fix(ingest/snowflake): parsing issues with empty queries by @anshbansal in #13446
- docs: remove old pages & assets by @yoonhyejin in #13367
- fix(build): fix local quickstart builds by @chakru-r in #13445
- docs: Adding color to 3.10 release notes by @jayacryl in #13448
- fix(ingest/mode): Not failing if queries endpoint returns 404 by @treff7es in #13447
- chore(avro): bump parquet-avro version by @esteban in #13452
- ci(): run smoke tests on release by @david-leifker in #13454
- feat(UI): funnel subtype for dataflows and datajobs all the way to the UI by @gabe-lyons in #13455
- improvement(ui): add wrapper component for stop propagation by @purnimagarg1 in #13434
- chore(ingest): bump bounds on cooperative timeout test by @hsheth2 in #13449
- feat(docs): Add 0.3.10.4 hotfix release notes by @pedro93 in #13458
- fix(authentication) redirection for native login and sso to function within iframes by @jayacryl in #13453
- fix(docs): Add feature availability to audit API by @pedro93 in #13459
- docs: remove markprompt by @yoonhyejin in #13463
- docs: add runllm chatbot by @yoonhyejin in #13464
- feat(sdk): add datajob lineage & dataset sql parsing lineage by @yoonhyejin in #13365
- fix(versioning): Properly set versioning scheme on unlink; always run side effects by @asikowitz in #13440
- feat(sdk): update lineage sample script to use client.lineage by @yoonhyejin in #13467
- feat(docs): Add doc links for 1.0.0 by @pedro93 in #13462
- docs: remove 0.15.0 from archived list by @yoonhyejin in #13468
- fix(docs): Add requirement on yarn for documentation for local development by @pedro93 in #13461
- feat(ingestion/kafka): Add optional externalURL base for link to external platform by @acrylJonny in #12675
- feat(ingest): filter by database in superset and preset by @kevinkarchacryl in #13409
- fix(web) domain search result item visual cleanup by @jayacryl in #13474
- fix(test): prevent audit test flakiness by @RyanHolstien in #13475
- fix(ingest/hive): Fix hive properties with double colon by @treff7es in #13478
- fix(sdk): use pluralized assertions by @anthonyburdi in #13481
- docs: Add show manage tags environment var by @asikowitz in #13482
- docs(): v0.3.11 DataHub Cloud Docs by @david-leifker in #13439
- tests(ingestion): fixes hex and hive docker flakiness by @sgomezvillamor in #13476
- fix(sdk): always url-encode entity links by @hsheth2 in #13483
- tests(smoke): removes non existing `mix_stderr` param in CliRunner by @sgomezvillamor in #13485
- docs: Adding notes on remote executors handling smart assertions by @jayacryl in #13479
- fix(cli): warn more strongly about hard deletion by @anshbansal in #13471
- fix(ui): enable to edit tag when properties aspect was not present by @anshbansal in #13470
- docs(cloud): fix indent, remove note not relevant for cloud users by @anshbansal in #13493
- Adding KafkaClients dependency to the datahub-upgrade module by @RafaelFranciscoLuqueCerezo in #13488
- feat(cassandra): Support ssl auth with cassandra by @gabe-lyons in #13465
- fix(ingest/presto): Presto/Trino property extraction fix by @treff7es in #13487
- chore(): graphiql latest versions by @david-leifker in #13484
- feat(ingest/sql): column logic + join extraction by @hsheth2 in #13426
- chore(hex): debug logs by @sgomezvillamor in #13473
- fix(mssql): improve stored proc lineage + add `temporary_tables_pattern` config by @sgomezvillamor in #13415
- fix(ingest/mode): Additional pagination and timing metrics by @mminichino in #13497
- feat(config): add configurable search filter min length by @RyanHolstien in #13499
- ci(workflow): postgres consolidation & release unit tests by @david-leifker in #13500
- fix(config): fix mcp consumer batch property by @david-leifker in #13504
- fix(ci): enable publish on release by @david-leifker in #13506
- feat(ingestion/looker): extract group_labels from looker and add as tags in datahub by @acrylJonny in #13503
- Refactor elasticsearch search indexed by @jmacryl in #13451
- fix(ingest/mode): Additional 404 handling and caching update by @mminichino in #13508
- feat(platform): up limit of corpuser char length from 64 to 128 by @acrylJonny in #13510
- feat(ingest): improve join extraction by @hsheth2 in #13502
- feat(ingest): support pydantic v2 in mysql source by @hsheth2 in #13501
- fix(ci): metadata-io test by @david-leifker in #13514
- build(deps): bump gradle/gradle-build-action from 2 to 3 by @dependabot[bot] in #12951
- build(deps): bump actions/cache from 3 to 4 by @dependabot[bot] in #13346
- Support di...
v1.2.0rc4
fix(tests): disable auth for spark test via env var instead of env fi…
v1.2.0rc3
feat(ui) Add banner on v1 home page to warn of v2 UI deprecation (#14…
v1.2.0rc2
435f20b
fix(ci): adjust runner size (#14135)
v1.2.0rc1
fix(ingest/athena): Make Athena simple column v1 conversion optional …
v1.1.0
What's Changed
- dataset cli - add support for schema, round-tripping to yaml by @chakru-r in #12764
- feat(ingestion/superset): ownership info for charts, dashboards and datasets by @PeteMango in #12750
- feat(ingest): allowdenypattern for dashboard, chart, dataset in superset by @kevinkarchacryl in #12782
- feat(models): adds subtypes to most entities in the model by @shirshanka in #12783
- fix: fixes mypy complaints about pkgresources by @sgomezvillamor in #12790
- fix(ingestion): fixes producing some URNs with reserved characters by @sgomezvillamor in #12772
- feat(okta): custom properties for okta user by @sgomezvillamor in #12773
- feat(mssql): adds subtypes aspect for dataflow and datajobs by @sgomezvillamor in #12775
- feat(searchBarAutocomplete): add feature flag for search bar's autocomplete redesign by @v-tarasevich-blitz-brain in #12690
- fix(ingest): enable fuzzy case resolution for oracle sql by @hsheth2 in #12778
- style: update azure.md by removing extra word by @alexbransky in #12780
- fix(ui): change tags to properties in ml model view by @yoonhyejin in #12789
- fix(ui) Fix changing color and icon for domains in UI by @chriscollins3456 in #12792
- Support container in ML Model Group, Model and Deployment by @ryota-cloud in #12793
- docs: update mlflow ingestion docs to include new concept mappings by @yoonhyejin in #12791
- fix(web) move form entity sidebar to right to align with cloud by @jayacryl in #12796
- doc(iceberg): iceberg doc updates by @chakru-r in #12787
- docs: add exporting from source to write mcp guide by @yoonhyejin in #12800
- feat(ingest/redshift): support for datashares lineage by @mayurinehate in #12660
- feat(ingestion/business-glossary): Automatically generate predictable glossary term and node URNs when incompatible URL characters are specified in term and node names. by @acrylJonny in #12673
- fix(ingestion/oracle): Improved foreign key handling by @acrylJonny in #11867
- feat(ingest/iceberg): Introduce network problems resiliency for Iceberg source by @skrydal in #12804
- chore(postgres): bump version by @david-leifker in #12808
- chore(aws): bump aws libraries by @david-leifker in #12809
- feat(api): URN, Entity, and Aspect name Async Validation by @david-leifker in #12797
- feat(ingest): improve extract-sql-agg-log command by @hsheth2 in #12803
- fix(UI): Showing platform instances only once by @sakethvarma397 in #12806
- fix: search cache invalidation for iceberg entities by @chakru-r in #12805
- feat(docs): Release for DataHub Cloud 0.3.8.2 by @pedro93 in #12811
- refactor(graphql): simplify getLastIngestionRun method by @trialiya in #12706
- docs(ingest): update metadata-ingestion dev guide by @hsheth2 in #12779
- fix(ingest/oracle): refresh golden files by @hsheth2 in #12818
- fix(openapi): fix openapi timeseries async ingestion by @david-leifker in #12812
- docs(ingest/mode): update mode workspace docs by @hsheth2 in #12774
- fix(ingestion/superset): fixed iterate over int error for building urns by @PeteMango in #12807
- fix(doc): Disable Algolia search by @treff7es in #12831
- fix(build): build improvements to help with incremental builds by @chakru-r in #12823
- feat(docs) add perms req to ai docs by @jayacryl in #12819
- Add variable to show full title in lineage by default by @Blize in #12078
- fix(doc): re-enable Algolia search by @hsheth2 in #12834
- feat(ui): support all entities with display names in browse paths v2 by @Masterchen09 in #11657
- feat(ingestion/mlflow): improve mlflow connector to pull run and experiments by @yoonhyejin in #12587
- fix(workflows): Update pr-labeler by @asikowitz in #12835
- chore(ruff): enable some ignored rules by @sgomezvillamor in #12815
- feat(ingest/redshift): lineage for external schema created from redshift by @mayurinehate in #12826
- feat(openapi-ingestion): implement openapi ingestion by @david-leifker in #12757
- fix(ui) Hide default filters we want to hide from impact analysis by @chriscollins3456 in #12843
- fix(ui) Fix submitting when selecting replacement in deprecation modal by @chriscollins3456 in #12842
- fix(UI): Multiple data product delete modals by @sakethvarma397 in #12781
- fix(graphql/search): Remove schema field and data process instance from default search types by @asikowitz in #12845
- docs: clear remote executor docs by @anshbansal in #12839
- build(deps): bump @babel/runtime from 7.24.4 to 7.26.10 in /docs-website by @dependabot in #12844
- build(deps): bump @babel/runtime-corejs3 from 7.24.4 to 7.26.10 in /docs-website by @dependabot in #12846
- build(deps): bump @babel/helpers from 7.24.4 to 7.26.10 in /docs-website by @dependabot in #12847
- fix(jaas): fix jaas login by @david-leifker in #12848
- feat(gql) allow unsetting optional incident fields by @jayacryl in #12801
- fix(ingest/dynamodb): pass env to dataset urn function by @anshbansal in #12853
- feat(models): Add edges fields to data process instance relationship aspects by @asikowitz in #12860
- feat(ui): Update ExternalUrlButton to include self-hosted gitlab URLs by @k7ragav in #12734
- fix(ui) Support glossary nodes in autocomplete by @chriscollins3456 in #12858
- feat(ingest/mlflow): update dpi to use edge for lineage by @yoonhyejin in #12861
- fix(ge-profiler): catch TimeoutError by @sgomezvillamor in #12855
- fix(databricks): fixes profile median by @sgomezvillamor in #12856
- fix(ingest): fix error in deploy command by @hsheth2 in #12820
- docs(ingest): custom transformer remote executor by @anshbansal in #12864
- feat(restore-indices): createDefaultAspects argument by @david-leifker in #12859
- ci(tests):show cypress smoke tests in junit format for better reporting by @chakru-r in #12865
- feat(ingest/salesforce): include formula in in field description by @mayurinehate in #12840
- feat(ingestion-tracing): implement ingestion with tracing api by @david-leifker in #12714
- hotfix(ui): Addressing assertions hotfixes by @jjoyce0510 in #12785
- feat(ingestion) Adding vertexAI ingestion source (v1 - model group and model) by @ryota-cloud in #12632
- feat(ingest/hive): identify partition columns in hive tables by @deepgarg-visa in #12833
- fix(api-tracing): handle corner case for historic by @david-leifker in #12870
- docs(website) update docusaurus config by @maggiehays in #12862
- feat(system-metrics): track api usage by user, client, api by @david-leifker in #12872
- fix(ingest/snowfla...
v1.0.0
DataHub v1.0.0
Release Highlights
DataHub v1.0.0 is packed with exciting updates, including:
- A completely redesigned user experience focused on simplified navigation and a visually stunning interface.
- Unified support for Data & AI, including AI Model Group Versions, AI Model Lineage, Model Stats, and Experiment/Run ingestion.
- DataHub Iceberg Catalog, allowing users to manage Iceberg tables directly from DataHub.
Read the blog post here!
Changelog
New User Interface: Putting Usability First
With a completely re-designed user interface, DataHub v1.0 represents a fundamental rethinking of how users interact with their metadata and data assets. The new experience includes:
- Intuitive Platform-Based Navigation - Hierarchically browse data by database and schema in Snowflake, BigQuery, Redshift, Databricks, and more. Combine hierarchical navigation with filtering by data owners, domain, tags, and glossary terms to find the right data fast.
- Seamless Lineage Exploration - Our reimagined lineage view features multi-level expansion, name-based search, and column-level visibility, making it easier than ever to understand data relationships and impact.
- Integrated Data Quality - Make confident decisions with deeply integrated quality signals throughout the platform, helping you quickly identify and trust reliable data assets.
DataHub Admins can enable the new UI for all users by setting the `THEME_V2_DEFAULT` environment variable to `true`; until then, users can opt into the new experience by navigating to Settings > Appearance > Try New User Experience.
Comprehensive AI Asset Support: Unifying Data and AI
DataHub v1.0 treats AI assets as first-class citizens within the data ecosystem, allowing users to track their entire data-to-AI pipeline in one place.
- Unified Search and Discovery: Seamlessly search across models, model groups, and traditional data assets in one unified interface.
- Advanced Versioning System: Track multiple versions of datasets and ML models with detailed performance metrics and clear linkages between versions.
- Rich Model Statistics: Monitor key metrics across versions, understand performance trends, and make data-driven decisions about model deployment.
- End-to-End Lineage: Trace data flows from raw inputs through models to final outputs, with complete versioning support.
DataHub Iceberg REST Catalog Beta: Simplifying Data Lake Management
This release introduces an integration with Apache Iceberg that lets users manage Iceberg tables directly through DataHub. With it, users can:
- Create and manage Iceberg tables through DataHub
- Maintain consistent metadata across DataHub and Iceberg
- Facilitate data discovery by exposing Iceberg table metadata in DataHub
- Enable secure access to Iceberg tables through DataHub's permissions model
Read the docs here!
DataHub CLI
This release introduces the following improvements to our CLI:
- Added `container` command to apply tags, terms, and owners on all assets within the container. [#12418, #12436]
- Improved `delete` command to optionally reference a file with a list of URNs to be deleted. [#12247]
- Expanded `ingest` command to support ingesting MCPs from S3. [#12649]
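As a rough illustration of the `delete` improvement above, the sketch below prepares a file of URNs for deletion. The one-URN-per-line layout, the file name, the sample table names, and the helpers `dataset_urn` and `write_urn_file` are all assumptions made for illustration. These notes do not state the file format or the CLI option that reads it, so check the CLI help before relying on this.

```python
import tempfile
from pathlib import Path


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a dataset URN string (hypothetical helper, hand-rolled for this sketch)."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def write_urn_file(path: Path, urns: list[str]) -> int:
    """Write de-duplicated URNs, one per line (assumed layout); return how many were written."""
    unique = list(dict.fromkeys(urns))  # drops duplicates, keeps first-seen order
    path.write_text("\n".join(unique) + "\n", encoding="utf-8")
    return len(unique)


# Illustrative table names only; the duplicate entry is intentional.
tables = [
    "analytics.public.orders",
    "analytics.public.customers",
    "analytics.public.orders",
]
urns = [dataset_urn("snowflake", t) for t in tables]

out_path = Path(tempfile.gettempdir()) / "urns_to_delete.txt"
count = write_urn_file(out_path, urns)
print(f"wrote {count} URNs to {out_path}")
```

Generating the file from code rather than by hand makes it easier to review the exact list of URNs before anything is deleted.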
Metadata Ingestion
We're continuously improving our integrations to add new capabilities and squash bugs.
- dbt: Added the parameter `include_database_name` to support including the database name in URN generation. [#12411]
- Iceberg: Alongside our new Iceberg Catalog API, we've made various improvements to our Iceberg integration. [#12744]
- MLFlow: Significantly revamped our MLFlow connector, adding support for tracking Model Group Versions and Model Stats; tracking Model lineage to underlying datasets; and capturing Experiments and Runs.
- MSSQL: Improved support for extracting stored procedures from MS SQL. [#12244, #12563]
- Oracle: Improved the accuracy of column-level lineage resolution.
- PowerBI: Improved lineage mapping so PowerBI Reports can now contain PowerBI Dashboards. [#12451]
- Redshift: Added support for data shares and external schemas, including automatic lineage resolution across Redshift namespaces.
- S3: Added functionality to the S3 ingestion process to ignore paths that do not match the specified depth, resolving warning messages triggered by mismatched paths. [#12326]
- Snowflake: Added support for Snowflake Streams and Hybrid Tables, and fixed a bug with lineage resolution across table renames. [#12318]
- Superset (community contribution!): Added support for Superset virtual datasets and lineage. [#12679]
Additionally, we're working on a new integration with Vertex AI. Please reach out if you're interested in joining the beta.
Of course, this only scratches the surface of changes. This release contains 100+ improvements across 25 different integrations.
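To make the dbt item above more concrete, here is a minimal sketch of an ingestion recipe, written as a Python dict, that sets `include_database_name`. The file paths, the `postgres` target platform, the server URL, and the choice of `False` are illustrative assumptions, not recommendations from these notes; consult the dbt source documentation for the parameter's default and exact behavior.

```python
import json

# A recipe expressed as a plain dict. All values are placeholders for illustration.
recipe = {
    "source": {
        "type": "dbt",
        "config": {
            "manifest_path": "target/manifest.json",  # assumed local dbt artifact path
            "catalog_path": "target/catalog.json",    # assumed local dbt artifact path
            "target_platform": "postgres",            # assumed warehouse platform
            # Parameter named in the notes above; False is only an example value.
            "include_database_name": False,
        },
    },
    "sink": {
        "type": "datahub-rest",
        "config": {"server": "http://localhost:8080"},  # assumed local DataHub endpoint
    },
}

# Print the recipe so it can be reviewed or saved alongside other configuration.
print(json.dumps(recipe, indent=2))
```

If the `acryl-datahub` package is installed, a dict of this shape can be handed to `Pipeline.create` from `datahub.ingestion.run.pipeline`; the same structure can also be written as a YAML recipe file.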
Thank You to our Contributors!
First-Time Contributors
@Bhadhri03 @brock-acryl @cccs-cat001 @davidebriscese @Deepalijain13 @dougbot01 @Haebuk @haon85 @josges @mihai103 @rajatgl17 @Rasnar @rharisi @samanthafigueredo5 @ttekampe
Repeat Contributors
@bda618 @deepgarg-visa @eagle-25 @jayasimhankv @ksrinath @llance @Masterchen09 @mayurinehate @mkamalas @PeteMango @pinakipb2 @remisalmon @sagar-salvi-apptware @svdimchenko @v-tarasevich-blitz-brain
Project Maintainers
@anshbansal @asikowitz @chakru-r @chriscollins3456 @david-leifker @gabe-lyons @hsheth2 @jayacryl @jjoyce0510 @kevinkarchacryl @pedro93 @RyanHolstien @ryota-cloud @sakethvarma397 @sgomezvillamor @shirshanka @skrydal @treff7es @yoonhyejin
View the full changelog: v0.15.0.1...v1.0.0
v1.0.0rc5
Full Changelog: v1.0.0rc4...v1.0.0rc5
v1.0.0rc4
Full Changelog: v1.0.0rc3...v1.0.0rc4
v1.0.0rc3
6097820
What's Changed
- fix(filters) Fix autocomplete for platforms and improve advanced search builder by @chriscollins3456 in #12560
- fix(ingest): handle groups in pattern_cleanup_ownership transformer by @cccs-cat001 in #12536
- tests(druid): integration tests for druid ingestion by @sgomezvillamor in #12717
- feat(api): let admins use granted privileges for actors by @anshbansal in #12718
- feat(build): use `pull_request_target` for datahub-wheels by @hsheth2 in #12722
- feat(ui): access management docs by @kevinkarchacryl in #12719
- fix(lineage): error message for edit lineage by @anshbansal in #12724
- docs: clarify limits on AI docs by @hsheth2 in #12728
- fix(urn-validation): additional test cases for urn validation by @david-leifker in #12727
- fix(ui) Fix NPE in pluralize function by @chriscollins3456 in #12629
- Fix platform instance support on Druid ingestion by @Rasnar in #12716
- ci(coverage): update patch coverage threshold by @chakru-r in #12733
- fix(ui) Fix bug with date dropdown in deprecation modal by @chriscollins3456 in #12633
- fix(ui) Fix group membership inconsistencies on group page by @chriscollins3456 in #12704
- fix(ui) Properly get display name when downloading search results by @chriscollins3456 in #12720
- fix(ingest): bump avro dep by @hsheth2 in #12729
- fix(ui) Filter healthy assets out of unhealthy upstreams component by @chriscollins3456 in #12705
- docs: update slack link by @hsheth2 in #12731
- fix(build): support datahub-wheels from forked PRs by @hsheth2 in #12730
- docs: add scarf integration by @hsheth2 in #12739
- fix(iceberg-cli): add missing filter for iceberg dataplatform by @chakru-r in #12732
- dev: immutable args remove by @anshbansal in #12735
- build(deps): bump dompurify from 2.5.4 to 3.2.4 in /datahub-web-react by @dependabot in #12643
- refactor(ui): Migrate to use the new Button component consistently by @jjoyce0510 in #12597
- docs(restore-indices): added best practices by @david-leifker in #12741
- feat(ui/lineageV2): Show version pill in lineage sidebar and node by @asikowitz in #12599
- chore(bump): Bump kafka-setup base by @david-leifker in #12743
- dev: enable ruff rule by @anshbansal in #12742
- revert(ci): revert datahub-wheel build changes by @hsheth2 in #12747
- feat: API key support in Metabase source by @rajatgl17 in #12711
- dev: enable ruff rule by @anshbansal in #12749
- refactor(ingest/s3): enhance readability by @eagle-25 in #12686
- feat(ingestion/superset): superset dataset lineage for metadata ingestion by @PeteMango in #12679
- chore(ci): avoid dep on confluent-kafka 2.8.1 by @hsheth2 in #12753
- feat(graphql): implement sort and facet for scroll by @david-leifker in #12746
- feat(ingest): improve error messages for unknown metadata objects by @hsheth2 in #12745
- fix(web) accurate error message for embeddedlistsearch by @jayacryl in #12622
- feat(ingestion/iceberg): Several improvements to iceberg connector by @skrydal in #12744
- fix(ingest): support pydantic v2 in file-based lineage by @hsheth2 in #12723
- feat(iceberg): improve concurrency control and resilience by @ksrinath in #12664
- docs(users+groups): show that you can set title via users YAML by @gabe-lyons in #12767
- feat(sdk): add search client by @hsheth2 in #12754
- feat(operations): ES and Kafka Operations Endpoints by @david-leifker in #12756
- feat(auth): support guest access by @chakru-r in #12619
- fix(iceberg): listnamespaces includes warehouse name as root by @chakru-r in #12761
- feat(UI): make searchbar centered and wider by @v-tarasevich-blitz-brain in #12666
- fix(ui) Fix order of parent containers on v2 autocomplete item by @chriscollins3456 in #12721
- fix(test): handle empty log by @david-leifker in #12768
- fix(lineage) Support views and sorting in impact analysis by @chriscollins3456 in #12769
- feat(versioning): Support entity versioning ingestion by @asikowitz in #12755
- fix(ui): add overflow wrap for dpi / model summary tab & add custom properties in mlmodelgroup queries by @yoonhyejin in #12771
- feat(sdk): add support for institutional memory links by @hsheth2 in #12770
New Contributors
- @Rasnar made their first contribution in #12716
- @rajatgl17 made their first contribution in #12711
- @PeteMango made their first contribution in #12679
- @v-tarasevich-blitz-brain made their first contribution in #12666
Full Changelog: v1.0.0rc2...v1.0.0rc3