The Unofficial Snowflake Monthly Release Notes: May 2024

Augusto Kiniama Rosa - Jun 17 - Dev Community

Monthly Snowflake Unofficial Release Notes #New features #Previews #Clients #Behavior Changes

Welcome to the fantastic Unofficial Release Notes for Snowflake! You’ll find all the latest features, drivers, and more in one convenient place.

As an unofficial source, I am excited to share my insights and thoughts. Let’s dive in! You can also find all of Snowflake’s releases here.

This month, we provide coverage up to release 8.21. In the labels below, GA stands for General Availability. I hope to eventually extend this to private preview notices as well.

I would appreciate your suggestions on continuing to combine these monthly release notes. Feel free to comment below or chat with me on LinkedIn.

What’s New in Snowflake

New Features

  • Preview: Triggered tasks, tasks can run only when the related stream has new data
  • Preview: Trust Center, use the Trust Center to evaluate and monitor your account for security risks in Snowsight
  • General Availability: Aggregation Policies, protect the privacy of individual rows by requiring analysts to run queries that aggregate data rather than retrieving individual rows
  • General Availability: Projection Policies, prevent queries from using a SELECT statement to project a column
  • General Availability: New SYSTEM$SEND_SNOWFLAKE_NOTIFICATION stored procedure for sending notifications to an email address or to a queue provided by a cloud service (Amazon SNS, Google Cloud PubSub, or Azure Event Grid)
  • General Availability: ASOF JOIN, joins each row of one table to the closest matching row of another table, typically on a timestamp column
  • GA: VECTOR data type, vector similarity functions, and the vector embedding function, which enable applications that require semantic vector search and retrieval; the related new SQL functions are VECTOR_INNER_PRODUCT, VECTOR_L2_DISTANCE, VECTOR_COSINE_SIMILARITY, and EMBED_TEXT_768 (SNOWFLAKE.CORTEX)
  • Preview: Serverless alerts, alerts can now use the serverless compute model
  • General Availability: Cost Insights, cost management tool that lets you identify opportunities for savings within an account
  • General Availability: Structured data types, an ARRAY, OBJECT, or MAP that contains elements or key-value pairs with specific Snowflake data types
  • Preview: Using a Git repository in Snowflake, integrate your remote Git repository with Snowflake so that files from the repository are synchronized to a special kind of stage called a repository stage
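
To make the ASOF JOIN item above more concrete, here is a minimal Python sketch of the matching idea: for each row on the left, pick the single row on the right whose timestamp is the closest one at or before it. This only illustrates the semantics of a "greater than or equal" match condition; it is not how Snowflake implements the join, and the table names, column names, and the asof_join helper are all made up for the example.

```python
from bisect import bisect_right


def asof_join(left, right, key="ts"):
    """Pair each left row with the right row whose key is the closest one <= the left key.

    Left rows without any earlier right row are kept and paired with None,
    similar to how an outer join pads missing matches with NULLs.
    """
    right_sorted = sorted(right, key=lambda r: r[key])
    right_keys = [r[key] for r in right_sorted]
    joined = []
    for row in left:
        # Index of the last right row whose key is <= this row's key.
        i = bisect_right(right_keys, row[key]) - 1
        match = right_sorted[i] if i >= 0 else None
        joined.append((row, match))
    return joined


# Hypothetical data: trades matched to the most recent quote.
trades = [{"ts": 5, "qty": 10}, {"ts": 12, "qty": 3}, {"ts": 1, "qty": 7}]
quotes = [{"ts": 2, "price": 100.0}, {"ts": 4, "price": 101.0}, {"ts": 11, "price": 99.5}]

result = asof_join(trades, quotes)
for trade, quote in result:
    print(trade, "->", quote)
```

The trade at ts=1 has no earlier quote, so it is paired with None rather than dropped.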

Snowsight Updates

  • GA: Finalizer tasks, now linked to the root task of task graphs in the task graph view.
  • Preview: Edit tasks, task editing in Snowsight
  • GA: New Create menu, provides a shortcut for creating the following items: SQL worksheet, Python worksheet, Streamlit App, Dashboard, Table, Stage and View
  • GA: New Add Data page, provides a combined view and quick access to all the data loading methods that Snowflake supports
  • Preview: Data Dictionary with masked PII, allow both providers and consumers to preview data for tables and views associated with listings
  • Preview: Data sharing and collaboration with listings is now available for accounts in U.S. government regions
  • GA: Suspend and resume tasks in Snowsight, can now suspend and resume your tasks directly in Snowsight
  • GA: Suspended time and reason descriptions, hover over any suspended label or icon to see the most recent time the task was suspended and if it was suspended manually or automatically due to failure
  • GA: Parameters on root tasks, displays the auto-suspend, auto-retry, and task timeout parameters in the task details page of the task graph’s root task
  • GA: Warehouses and Serverless Tasks, displays a Serverless Tasks icon and Serverless for the warehouse column for Serverless Tasks
  • GA: Task return values, displays assigned return values in the Task History
  • GA: Task run duration visualization, displays a bar-chart visualization of the duration of task runs
  • GA: Task run conditions, displays a condition column in your task list
  • GA: Task graph configurations, displays the task graph configuration and the task definition on the task details page
  • GA: Task definition view in task graphs, inspect the definitions of each task

Streamlit Updates

  • GA: Custom sleep timer, set a custom sleep timer for a Streamlit app to auto-suspend by creating a config.toml configuration file and specifying the timer
  • GA: Streamlit in Snowflake supports GCP
  • GA: Support for v1.29.0 and v1.31.1 of the Streamlit library

Performance Updates

  • Improved object replication, reduces the time spent in the SECONDARY_UPLOADING_INVENTORY and SECONDARY_DOWNLOADING_METADATA phases of a refresh operation by optimizing the synchronization of some objects and the authorization mechanism for replication operations
  • Reduced the latency for loading most Parquet files by up to 50% when the file format option, USE_VECTORIZED_SCANNER, is set to TRUE
  • Improved evaluation of aggregations so they are made at more intermediate join trees
  • Improved query execution times for queries that spend a significant amount of time communicating across virtual warehouse nodes
  • Improved top-k pruning for LIMIT and ORDER BY queries, reduces execution time for top-k queries due to fewer scanned files and file header reads
  • Improved join order decisions by calculating selectivity estimates with more granularity, reduces compilation time and query execution time by calculating selectivity estimates at the micro-partition level
  • Faster loading time for Python, improves performance for Streamlit in Snowflake apps (including Streamlit apps within a Snowflake Native App), Python worksheets, Python UDFs, and stored procedures in Python
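
The top-k pruning item above is easier to picture with a toy example. The sketch below is a generic illustration of how min/max metadata can let an engine skip files for an ORDER BY ... DESC LIMIT k query; it is my own simplification, not a description of Snowflake's internals. The file layout and the top_k_with_pruning helper are hypothetical.

```python
import heapq


def top_k_with_pruning(files, k):
    """Return the k largest values and the number of files actually scanned.

    Each file is a dict with a 'max' entry (metadata known without reading
    the file) and a 'values' list (the file contents).
    """
    heap = []  # min-heap holding the k largest values seen so far
    scanned = 0
    # Visit files with the largest max first so the k-th best value rises quickly.
    for f in sorted(files, key=lambda f: f["max"], reverse=True):
        if len(heap) == k and f["max"] <= heap[0]:
            # Neither this file nor any later one can improve the result.
            break
        scanned += 1
        for v in f["values"]:
            if len(heap) < k:
                heapq.heappush(heap, v)
            elif v > heap[0]:
                heapq.heapreplace(heap, v)
    return sorted(heap, reverse=True), scanned


# Hypothetical files with their max-value metadata.
files = [
    {"max": 9, "values": [1, 5, 9]},
    {"max": 40, "values": [40, 22, 31]},
    {"max": 30, "values": [30, 12, 8]},
    {"max": 3, "values": [3, 2]},
]

top, scanned = top_k_with_pruning(files, k=2)
print(top, scanned)
```

In this toy data, only one of the four files needs to be read, which is the kind of saving the release note describes as fewer scanned files.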

Organization Updates

  • Preview: allows you to gain organization-level insights into the cost of using Snowflake, including: current contract, balance, accumulated cost of Snowflake usage since the start of the contract, monthly spend for org, consumption of each account

SQL Updates

  • GA: Email notification integrations are no longer limited to 10 per account
  • GA: UNPIVOT supports rows with NULLs in results, use the { INCLUDE | EXCLUDE } NULLS option in an UNPIVOT subclause to specify whether to include rows with NULLs in the results
  • GA: Using the TABLE keyword as an alternative to SYSTEM$REFERENCE and SYSTEM$QUERY_REFERENCE, TABLE keyword to get a reference to a table, view, secure view, or query
  • Preview: CREATE OR ALTER TABLE and CREATE OR ALTER TASK, combine the functionality of the CREATE command and the ALTER command
  • General Availability: Dynamic pivot, use the ANY keyword or a subquery in the PIVOT subclause instead of specifying the pivot values explicitly
  • General Availability: UDFs now support structured data types when created in Java, Python, and Scala
  • Preview: Jinja2 template support for EXECUTE IMMEDIATE FROM, generate and execute SQL scripts using a Jinja2 template file
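
For the UNPIVOT change above, a small Python sketch can show what the { INCLUDE | EXCLUDE } NULLS choice means for the result. This only mimics the row-level behavior on an in-memory list; the unpivot helper and the column names (emp, jan, feb, month, sales) are hypothetical and not part of any Snowflake API.

```python
def unpivot(rows, id_col, value_cols, include_nulls=False):
    """Turn the given columns into (month, sales) rows.

    With include_nulls=False, rows whose value is None are skipped,
    which mirrors the EXCLUDE NULLS behavior. With include_nulls=True
    they are kept, which mirrors INCLUDE NULLS.
    """
    out = []
    for row in rows:
        for col in value_cols:
            value = row[col]
            if value is None and not include_nulls:
                continue
            out.append({id_col: row[id_col], "month": col, "sales": value})
    return out


# Hypothetical wide table with some missing values.
sales = [
    {"emp": 1, "jan": 100, "feb": None},
    {"emp": 2, "jan": None, "feb": 50},
]

excluded = unpivot(sales, "emp", ["jan", "feb"])
included = unpivot(sales, "emp", ["jan", "feb"], include_nulls=True)
print(len(excluded), len(included))
```

The difference matters when a missing value is itself meaningful, for example when you need one output row per employee per month regardless of whether a sale was recorded.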

Machine Learning Updates (Cortex and ML)

  • GA: Snowflake Model Registry, allows you to securely store, manage, and use ML models in Snowflake, and the registry supports the most popular types of Python ML models
  • GA: Cortex LLM Functions, instant access to industry-leading LLMs, including Snowflake Arctic, llama3, reka, mistral, gemma, and e5, through the functions COMPLETE, EXTRACT_ANSWER, SENTIMENT, SUMMARIZE, and TRANSLATE
  • Preview: New model for vector embedding, snowflake-arctic-embed-m model available for text embedding tasks
  • Preview: Document AI, intelligent document processing (IDP) workflows within Snowflake by extracting information from documents, such as invoices or contracts, and directly applying it to operational workflows — available on AWS and Azure
  • GA: Simpler SQL for storing results from ML functions, can now call the Forecast and Detect Anomalies ML Functions directly in the FROM clause of a SELECT statement
  • Preview: Snowflake ML Classification, improved handling of timestamp and high-cardinality features and labels; because of this change, newly trained models may produce different results even on the same training data
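
Since vector embeddings show up both here and in the new VECTOR functions, a short Python sketch of the three similarity measures may help. These are the standard mathematical definitions (inner product, Euclidean distance, cosine similarity) that the SQL function names refer to; the Python function names, the tiny two-dimensional vectors, and the document names are made up for illustration, and real embeddings such as those from EMBED_TEXT_768 have 768 dimensions.

```python
import math


def vector_inner_product(a, b):
    """Sum of element-wise products (dot product)."""
    return sum(x * y for x, y in zip(a, b))


def vector_l2_distance(a, b):
    """Euclidean (straight-line) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def vector_cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    norm_a = math.sqrt(vector_inner_product(a, a))
    norm_b = math.sqrt(vector_inner_product(b, b))
    return vector_inner_product(a, b) / (norm_a * norm_b)


# Hypothetical query embedding and document embeddings.
query = [1.0, 0.0]
docs = {"doc_a": [0.0, 1.0], "doc_b": [3.0, 4.0], "doc_c": [2.0, 0.0]}

# Rank documents from most to least similar to the query.
ranked = sorted(
    docs,
    key=lambda name: vector_cosine_similarity(query, docs[name]),
    reverse=True,
)
print(ranked)
```

Semantic search boils down to this ranking step: embed the query, compare it against stored embeddings, and return the closest matches.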

Data Clean Rooms Updates

  • GA: Tracing user activity in the web app, administrators can attribute activity in the web app to specific users
  • GA: Multi-provider clean rooms via Developer APIs, consumers can use the developer API to run an approved workload across multiple clean rooms, which lets them gain insights from datasets from more than one provider in the same analysis
  • GA: Additional supported regions, Europe (all clouds) and GCP, NA
  • GA: Support for views in the web app, providers and consumers can use the web app to link views, materialized views, and secure views
  • GA: Clean room customizations for identity & activation, providers can customize which activation, identity, and data provider partners display as options within a clean room
  • GA: Custom template enhancements, enhancements to the process of creating a user interface

Data Pipelines/Data Loading/Unloading Updates

  • General Availability: New Parquet file format option USE_VECTORIZED_SCANNER, new file format option

Open-Source Updates

Security Updates

  • Tri-Secret Secure self-registration, self-registration process for customer-managed keys (CMKs)

Iceberg Table Updates

  • GA: Replace invalid UTF-8 characters in Iceberg tables
  • GA: Structured type evolution
  • GA: Set a storage serialization policy
  • GA: Change ALLOW_WRITES to FALSE for external volumes
  • GA: New ICEBERG_ACCESS_ERRORS view

Client, Drivers, Libraries and Connectors Updates

New features:

  • Snowflake Connector for Google Analytics Raw Data 1.0.0 (the PUBLIC.UPDATE_CONNECTION procedure allows re-authenticating a running connector, and the related Google Analytics properties are automatically re-enabled for ingestion on reinstall)
  • Snowflake Connector for Google Analytics Raw Data 1.1.0 (behavior change: disabling a property that is ingesting incremental intraday data removes the currently ingested day if ingestion was not fully completed)
  • Snowflake Connector for Google Analytics Raw Data 1.1.1 (added the UPDATE_CONNECTION_CONFIGURATION procedure)
  • Snowflake Connector for Google Analytics Raw Data 1.2.0 (added a health check task to all connector instances)
  • Snowflake Connector for ServiceNow V2 5.2.0 (optional table_name and sys_id arguments to FINALIZE_CONNECTOR_CONFIGURATION to help in journal table validation)
  • Go Snowflake Driver 1.10.0 (support for structured types (structured objects, arrays, and maps), option to skip driver registration during startup, added the SECURITY.md file so customers can review Snowflake’s security policy, and ability to set custom logger fields)
  • Go Snowflake Driver 1.10.1 (Upgraded AWS SDK dependencies, automatic password masking in logs, DisableSamlURLCheck parameter to disable SAML URL checks, support for binding semi-structured types, decreased the number of retries to OCSP, OcspMaxRetryCount and OcspResponderTimeout variables to define the OCSP maximum retry count and timeout)
  • Ingest Java SDK 2.1.1 (more detailed error messages for the INVALID_CHANNEL error, support for external OAuth 2.0)
  • Node.js Driver 1.11.0 (disableSamlURLCheck parameter to disable SAML URL checks; representNullAsStringNull configuration parameter to specify how the fetchAsString method returns null values (when disabled, fetchAsString returns null values as NULL instead of as the string “NULL”); released Snowflake’s official d.ts type declaration file to support TypeScript users; removed the unused dependencies agent-base, debug, and extend)
  • ODBC Driver 3.3.1 (Updated the following library versions: arrow from 0.17.1 to 15.0.0, aws sdk from 1.3.50 to 1.11.283, curl from 8.6.0 to 8.7.1)
  • Snowpark ML 1.5.1 (new Model Registry features: the log_model, get_model, and delete_model methods now support fully qualified names; new modeling features: you can use an anonymous stored procedure during fitting, so that modeling does not require privileges to operate on the registry schema. Call import snowflake.ml.modeling.parameters.enable_anonymous_sproc to enable this feature)
  • Snowpark ML 1.5.0 (Model Registry behavior changes: the fit_transform method can now return either a Snowpark DataFrame or a pandas DataFrame, matching the kind of DataFrame passed to the method)
  • Snowpark Library for Python 1.18.0 (support for DataFrame.pivot_table with no index parameter and with the margins parameter; updated the signature of DataFrame.shift, Series.shift, DataFrameGroupBy.shift, and SeriesGroupBy.shift to match pandas 2.2.1 (Snowpark pandas does not yet support the newly added suffix argument or sequence values of periods); re-added support for Series.str.split; lots of local testing updates and bug fixes)
  • Snowpark Library for Python 1.17.0 (support for adding a comment on tables and views using the following functions: DataFrameWriter.save_as_table, DataFrame.create_or_replace_view, DataFrame.create_or_replace_temp_view, DataFrame.create_or_replace_dynamic_table; lots of local testing updates and bug fixes)
  • Snowpark Library for Python 1.16.0 (snowflake.snowpark.Session.lineage.trace to explore data lineage of Snowflake objects, support for registering stored procedures with packages given as Python modules, support for structured type schema parsing)
  • Snowflake CLI 2.3.0 (added the --info option for the snow command to display the configured feature flags; added the -D/--variable option to the snow sql command to support variable substitutions in SQL input (client-side query templating); support for fully qualified stage names in the snow stage and snow git execute commands; ability to specify files and directories as arguments for the snow app deploy command; new options for the snow app deploy command: --recursive to sync all files and subdirectories recursively, --prune to delete specified files from the stage if they don’t exist locally; optimized the Snowpark dependency search to reduce the size of .zip artifacts and the number of Anaconda dependencies for Snowpark projects; improved error messages for a corrupted config.toml file)
  • JDBC Driver 3.16.1 (the disableSamlURLCheck parameter to disable SAML URL checks)

Bug fixes:

  • SnowSQL 1.3.0 (behavior change: this release disables automatic upgrades, to fix an issue where expired S3 licenses caused SnowSQL to fail; fixed an issue where the lack of permission to create the log directory aborted SnowSQL; fixed an issue where the endpoint was not created correctly when connecting to the China deployment)
  • Snowflake Connector for Google Analytics Raw Data 0.19.2 (issue with refreshing OAuth access tokens that was causing long-running ingestions to fail)
  • Snowflake Connector for Google Analytics Raw Data 1.0.0 (connector tasks now have a fixed set of properties, mostly related to AUTOCOMMIT and date-time formats, required for these tasks to work correctly)
  • Snowflake Connector for Google Analytics Raw Data 1.0.1 (the dispatcher task was adjusted to never suspend automatically, and ingestion worker tasks now have a longer timeout of 6 hours; this overrides account-level parameter settings)
  • Snowflake Connector for Google Analytics Raw Data 1.1.0 (fixed an issue where pausing or resuming the connector left it in the intermediate state PAUSING/STARTING)
  • Snowflake Connector for Google Analytics Aggregate Data 1.0.1 (connector could enter an inconsistent state during pausing or resuming)
  • Snowflake Connector for ServiceNow V2 5.2.0 (Improve URL validation in SET_CONNECTION_CONFIGURATION to support custom ServiceNow® domains)
  • Snowflake Connector for ServiceNow V2 5.3.0 (Fix handling the null value of the journal_table property in the object passed to the FINALIZE_CONNECTOR_CONFIGURATION procedure. The journal_table parameter can now also be skipped)
  • Go Snowflake Driver 1.10.0 (closing the error channel twice when using async mode, race condition when accessing temporal credentials)
  • Go Snowflake Driver 1.10.1 (exposed objects in Arrow batches mode, extracting account names when using key-pair authentication)
  • Ingest Java SDK 2.1.1 (upgraded several dependencies, including vulnerability fixes, issue where HTTP connections are leaked due to error responses, relaxed the file size constraints to deal with issues where longer client flush lags produce larger files)
  • Snowpark ML 1.5.0 (Model Registry bug fixes: fixed the “invalid parameter SHOW_MODEL_DETAILS_IN_SHOW_VERSIONS_IN_MODEL” error)
  • Snowpark ML 1.5.1 (Model registry bug fixes: issue with loading older models)
  • Snowpark Library for Python 1.18.0 (mixed columns for string methods (Series.str.*))
  • Snowpark Library for Python 1.16.0 (bug where, when inferring a schema, single quotes were added to stage files that already had single quotes)
  • Snowflake Connector for Python 3.10.1 (incorrect error log message that could occur during arrow data conversion)
  • Snowflake CLI 2.4.0 (added the --cascade option to the snow app teardown command, which automatically drops all application objects owned by an application; added external access integration to snow object commands; added aliases for the snow object list, describe, and drop commands for the following: snow stage for stages, snow git for Git repository stages, snow streamlit for Streamlit apps, snow snowpark for Snowpark Python procedures and functions, snow spcs compute-pool for compute pools, snow spcs image-repository for image repositories, snow spcs service for services; added the following support to the snow sql command: it works with the snowflake.yml file, so the variables defined in the new env section of snowflake.yml can be used to expand templates, and it allows executing queries from multiple files by specifying multiple -f/--file options; added support for passing input variables to the snow git execute and snow stage execute commands; added the following snow cortex commands to support Snowflake Cortex: complete (generates a response to a question using your choice of language model), extract-answer (extracts an answer to a given question from a text document), sentiment (returns a sentiment score for the given English-language input text), summarize (summarizes the given English-language input text), translate (translates text from the indicated or detected source language to a target language); added tab-completion for snow commands; added the following improvements: executing the snow command with no arguments or options now automatically displays the command-line help (as in snow --help), improved support for quoted identifiers)
  • Snowflake CLI 2.3.1 (bugs in the source artifact mapping logic for Snowflake Native Apps)
  • Snowflake CLI 2.3.0 (issue with the snow app commands that caused files to be re-uploaded unnecessarily, issue where the snow app run command did not upgrade an application when the local state and remote stage are identical, issue with handling the stage path separators on Windows)
  • Snowflake Connector for Kafka 2.2.2 (issue where the staged files are not cleaned up properly)
  • Node.js Driver 1.11.0 (issue with millisecond precision, issue with creating paths on Windows when using the PUT command)
  • JDBC Driver 3.16.1 (issue with choosing the S3 regional URL domain based on the region name, issue related to nested paths in Windows when parsing client configurations, issue with the getObject method for arrays in JSON, fixed a casting issue with a MapVector)
  • Snowpark Library for Scala and Java 1.12.1 (fixed “Dataframe alias doesn’t work in the JOIN condition”)

Conclusion

Check out how many features reached General Availability, and the platform continues to expand everywhere. Clearly, the focus was on putting everything in Snowsight, along with much of Cortex and Snowflake ML, including the vector data type, into everyone’s hands. Things of note are Trust Center and Triggered tasks. I am looking forward to writing about some of these.

I hope you continue to enjoy these articles.

I am Augusto Rosa, VP of Engineering for Infostrux Solutions and a Snowflake Data Super Hero and SME. Thanks for reading my blog post. You can follow me on LinkedIn.

Subscribe to Infostrux Medium Blogs https://medium.com/infostrux-solutions for the most interesting Data Engineering and Snowflake news.
