Snowflake Summit 2023: Harnessing LLMs & Doc AI for Industry Change

Large language models are making waves across all industries and Document AI is becoming common as organizations look to unlock even greater business potential. But where do you begin? (Hint: Strategy is essential, and 2nd Watch has been working through the implications of LLMs and Document AI for more than a year to help you navigate through the hype.)

Beyond the continued splash of LLM and Document AI discussions, this year’s Snowflake Summit focused on a couple of practical but still substantial announcements: an embrace of open source (in both applications and AI/LLM models) and – maybe most impactful in the long run – the native first-party Microsoft Azure integration and expanded partnership. I’ll start there and work backwards to set the stage, then dig into what some of the transformative LLM and Document AI use cases actually are across industries and share which are trending toward the greatest and most immediate impact, according to participants in 2nd Watch’s LLM industry use case battle, which ran throughout Snowflake Summit.


Snowflake + Microsoft Azure: Simplifying Integration and Enabling Native Snowflake Apps

The Snowflake and Microsoft Azure integration and expanded partnership is a big deal. Snowflake and Azure have paved the path for their customers, freeing them up from making difficult integration decisions.

For 2nd Watch, as a leader working with both Microsoft and Snowflake since as early as 2015, seeing a roadmap that integrates Snowflake with Azure’s core data services immediately brought to mind a customer value prop that will drive real and immediate decisions throughout enterprises. With a stronger partnership, Azure customers will reap benefits from both a technology standpoint and an overall go-to-market effort between the two organizations, from data governance via Azure Purview to AI via Cognitive Services.

Running your workloads where you want, how you want, has always been a key vision of Snowflake’s long-term roadmap, especially since the introduction of Snowpark. While the Microsoft announcement expanded on that roadmap, Snowflake continued to push even further with performance upgrades and new features for both Snowpark and Apache Iceberg (allowing for data to be stored as Parquet files in your storage buckets). Customers will be able to build and run applications and AI models in containers, natively on Snowflake, whether that’s using Streamlit, built using Snowflake’s Native App Framework, or all of the above. With all your data in a centralized place and Apache Iceberg allowing for portability, there’s a compelling reason to consider building and deploying more apps directly in Snowflake, thereby avoiding the need to sync data, buy middleware, or build custom integrations between apps.

Snowflake + NVIDIA: Embracing Open Source for AI and LLM Modeling

Another major theme throughout Summit was an embrace of openness and open source. One of the first major cornerstones of the event was the announcement of NVIDIA and Snowflake’s partnership, an integration that unlocks the ability for customers to leverage open-source models.

What does this mean for you? This integration opens up the ability to both run and train your own AI and LLM models directly where your data lives – ensuring both privacy and security as the data no longer needs to be pushed to an external, third-party API. From custom Document AI models to open-source, fine-tuned LLMs, the ability to take advantage of NVIDIA’s GPU cloud reduces the latency both in training/feedback loops and use in document and embedding-based retrieval (such as document question answering across vast amounts of data).

Document AI: Introducing Snowflake’s Native Features

The 2nd Watch team was excited to see how spot on our 2023 data and AI predictions were, as we even went so far as to feature Document AI in our exhibit booth design and host an LLM industry use case battle during expo hours. Document AI will be key to transformative industry use cases in insurance, private equity, legal, manufacturing – you name it. From contract analysis and risk modeling to competitive intelligence and marketing personalization, Document AI can have far-reaching impacts, and Snowflake is primed to be a major player in the Document AI space.

Many organizations are just beginning to identify use cases for their AI and LLM workloads, but we’ve already spent the past year combining our existing offerings of Document AI with LLM capabilities. (This was the starting point of our previously mentioned industry use case battle, which we’ll discuss in more detail below.) With Snowflake’s announcement of native Document AI features, organizations now have the ability to tap into valuable unstructured data that’s been sitting across content management systems, largely unused, due to the incredibly costly and time-consuming effort it takes to manually parse or extract data from documents – particularly when the formats or templates differ across documents.

Snowflake’s Document AI capabilities allow organizations to extract structured data from PDFs via natural language and, by combining what is likely a vision transformer with an LLM, build automations to do this at scale. The data labeling process is by far the most crucial step in every AI workload: if your model isn’t trained on enough high-quality examples, it will produce equally poor results in automated workloads. Third-party software products, such as Snorkel AI, allow for automated data labeling using your existing data, but one of the key findings in nearly every AI-related research paper is the same: high-quality data is what matters, and the effort you put into building that source of truth will yield exponential benefits downstream via Document AI, LLMs, and other data-centric applications.
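To make the extraction step concrete, here is a minimal sketch of schema-bound extraction: build a prompt asking an LLM to return specific fields as JSON, then validate the reply before loading it downstream. The schema, field names, and prompt wording are illustrative assumptions, not Snowflake’s actual Document AI interface, and the LLM call itself is omitted.

```python
import json

# Illustrative schema: fields to pull from each insurance policy document.
# (Hypothetical field names; not Snowflake's actual Document AI interface.)
SCHEMA = {"policy_number": str, "effective_date": str, "premium": float}

def build_extraction_prompt(document_text: str) -> str:
    """Compose a natural-language extraction prompt for an LLM (call not shown)."""
    fields = ", ".join(SCHEMA)
    return (
        f"Extract the following fields as a JSON object: {fields}.\n"
        "Respond with JSON only.\n"
        f"Document:\n{document_text}"
    )

def validate(raw_response: str) -> dict:
    """Parse the model's JSON reply and type-check it against the schema."""
    record = json.loads(raw_response)
    for field, expected_type in SCHEMA.items():
        if not isinstance(record.get(field), expected_type):
            raise ValueError(f"bad or missing field: {field}")
    return record
```

Pairing a validation step like this with human review of the failures is one way to accumulate the high-quality labeled examples these models depend on.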

Leveraging Snowflake’s Data Cloud, the end-to-end process can be managed entirely within Snowflake, streamlining governance and privacy controls and mitigating the risk posed by both current and future regulations across the globe, particularly when it comes to assessing what’s in the training data you feed your AI models.

Retrieval Augmented Generation: Exploring Those Transformative Industry Use Cases

It’s likely become clear how widely applicable Document AI and retrieval augmented generation are. (Retrieval augmented generation, or RAG, retrieves data from various sources, such as image processors, auto-generated SQL, and documents, to augment your prompts.) But to show how great an impact they can have on your organization’s ability to harness the full breadth and depth of its data, let’s talk through specific use cases across a selection of industries.
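As a concrete illustration, here is a minimal RAG sketch: rank documents against the question, then prepend the best matches to the prompt. A real pipeline would use an embedding model and a vector store rather than the toy word-overlap scoring below, and the sample documents are invented.

```python
import math
from collections import Counter

# Toy corpus standing in for parsed documents, query results, etc. (invented examples)
DOCUMENTS = [
    "Policy 123 covers flood damage up to 250,000 with a 5,000 deductible.",
    "Reinsurance treaty 456 caps losses at 10M per catastrophic event.",
    "Quarterly revenue grew 12 percent, driven by the commercial lines segment.",
]

def _vector(text: str) -> Counter:
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (toy word-overlap scoring)."""
    q = _vector(query)
    return sorted(DOCUMENTS, key=lambda d: _cosine(q, _vector(d)), reverse=True)[:k]

def build_prompt(question: str) -> str:
    """Augment the question with retrieved context before sending it to an LLM."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The retrieval step is what grounds the LLM’s answer in your own data rather than whatever it saw in pretraining.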

AI for Insurance

According to 2nd Watch’s LLM industry use case battle, contract analytics (particularly in reinsurance) reigned supreme as the most impactful use case. Unsurprisingly, policy and quote insights also stayed toward the top, followed by personalized carrier and product recommendations.

Insurance organizations can utilize both Document AI and LLMs to capture key details from different carriers and products, generating personalized insurance policies while understanding pricing trends. LLMs can also alert policy admins or automate administration tasks, such as renewals, changes, and cancellations. These alerts can allow for human-in-the-loop feedback and review, and feed into workflow and process improvement initiatives.

AI in Private Equity Firms

In the private equity sector, firms can leverage Document AI and question-answering features to securely analyze their financial and research documents. This “research analyst co-pilot” can answer queries across all documents and structured data in one place, enabling analysts to make informed decisions rapidly. Plus, private equity firms can use LLMs to analyze company reports, financial and operational data, and market trends for M&A due diligence and portfolio company benchmarking.

However, according to the opinions shared by Snowflake Summit attendees who stopped by our exhibit booth, benchmarking is the least interesting application of AI in private equity, with its ranking dropping throughout the event. Instead, Document AI question answering was the top-ranked use case, with AI-assisted opportunity and deal sourcing coming in second.

Legal Industry LLM Insights

Like both insurance and private equity, the legal industry can benefit from LLM document review and analysis; and this was the highest-ranked LLM use case within legal. Insights from complex legal documents, contracts, and court filings can be stored as embeddings in a vector database for retrieval and comparison, helping to speed up the review process and reduce the workload on legal professionals.
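To sketch what storing and retrieving those embeddings looks like, here is a toy in-memory stand-in for a vector database of contract clauses. The clauses and three-dimensional vectors are invented placeholders; real embeddings come from an embedding model and would live in an actual vector database.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class ClauseStore:
    """Toy in-memory stand-in for a vector database of clause embeddings."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float]]] = []

    def add(self, clause: str, embedding: list[float]) -> None:
        self._rows.append((clause, embedding))

    def nearest(self, embedding: list[float], k: int = 1) -> list[str]:
        """Return the k stored clauses most similar to the query embedding."""
        ranked = sorted(self._rows, key=lambda row: cosine(embedding, row[1]), reverse=True)
        return [clause for clause, _ in ranked[:k]]
```

An analyst’s question is embedded the same way, and the nearest clauses are pulled back for side-by-side comparison or fed into an LLM prompt.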

Case law research made a big comeback in our LLM battle, coming from sixth position to briefly rest in second and finally land in third place, behind talent acquisition and HR analytics. Of course, those LLM applications are not unique to law firms and legal departments, so it comes as no surprise that they rank highly.

Manufacturing AI Use Cases

Manufacturers proved to have widely ranging opinions on the most impactful LLM use cases, with rankings swinging wildly throughout Snowflake Summit. Predictive maintenance did hold on to the number one spot, as LLMs can analyze machine logs and maintenance records, identify similar past instances, and incorporate historical machine performance metrics to enable a predictive maintenance system. 

Otherwise, use cases like brand perception insights, quality control checks, and advanced customer segmentation repeatedly swapped positions. Ultimately, competitive intelligence landed in a tie with supply chain optimization and demand forecasting. By gleaning insights from unstructured sources like news articles, social media, and company reports, and coupling them with structured data like market statistics and company performance figures, LLMs can produce well-rounded competitive intelligence outputs. It’s no wonder this use case tied with supply chain and demand forecasting – in which LLMs analyze supply chain data and imaging at ports and other supply chain hubs for potential risks, then combine that analysis with traditional time-series demand forecasting to find optimization opportunities. Both use cases focus on how manufacturers can optimally position themselves for an advantage within the market.

Even More LLM Use Cases

Not to belabor the point, but Document AI and LLMs have such broad applications across industries that we had to call out several more:

  • Regulatory and Risk Compliance: LLMs can help monitor and ensure compliance with financial regulations. These compliance checks can be stored as embeddings in a vector database for auditing and internal insights.
  • Copyright Violation Detection: LLMs can analyze media content for potential copyright violations, allowing for automated retrieval of similar instances or known copyrighted material and flagging.
  • Personalized Healthcare: LLMs can analyze patient symptoms and medical histories from unstructured data and EHRs, alongside the latest medical research and findings, enabling more effective treatment plans.
  • Medical Imaging Analysis: Use LLMs to help interpret medical imaging, alongside diagnoses, treatment plans, and medical history, allowing for patient imaging to suggest potential diagnoses and drug therapies based on the latest research and historical data.
  • Automated Content Tagging: Multimodal models and LLMs can analyze media content across video, audio, and text to generate relevant tags and keywords for automated content classification, search, and discovery.
  • Brand Perception Insights: LLMs can analyze social media and online reviews to assess brand perception.
  • Customer Support Copilots: LLMs can function as chatbots and copilots for customer service representatives, enabling customers to ask questions and upload photos of products while the CSR quickly retrieves relevant information, such as product manuals, warranty information, or other internal knowledge base data that would typically be retrieved manually. By storing past customer interactions in a vector database, the system can retrieve relevant solutions based on similarity and improve over time, making the CSR more effective and creating a better customer experience.

More broadly, LLMs can be utilized to analyze company reports, research documents, news articles, financial data, and market trends, storing these relationships natively in Snowflake, side-by-side with structured data warehouse data and unstructured documents, images, or audio. 

Snowflake Summit 2023 ended with the same clear focus that I’ve always found most compelling within their platform – giving customers simplicity, flexibility, and choice for running their data-centric workloads. That’s now been expanded to Microsoft, to the open-source community, to unstructured data and documents, and to AI and LLMs. Across every single industry, there’s a practical workload that can be applied today to solve high-value, complex business problems.

I was struck by not only the major (and pleasantly unexpected) announcements and partnerships, but also the magnitude of the event itself. Some of the most innovative minds in the data ecosystem came together to engage in curiosity-driven conversation, sharing what they’re working on, what’s worked, and what hasn’t worked. And that last part – especially as we continue to push forward on the frontier of LLMs – is what made the week so compelling and memorable.

With 2nd Watch’s experience, research, and findings in these new workloads, combined with our history working with Snowflake, we look forward to having more discussions like those we held throughout Summit to help identify and solve long-standing business problems in new, innovative ways. If you’d like to talk through Document AI and LLM use cases specific to your organization, please get in touch.


Data and AI Predictions in 2023

As we reveal our data and AI predictions for 2023, join us at 2nd Watch to stay ahead of the curve and propel your business towards innovation and success. How do we know that artificial intelligence (AI) and large language models (LLMs) have reached a tipping point? It was the hot topic at most families’ dinner tables during the 2022 holiday break.


AI has become mainstream and accessible. Most notably, OpenAI’s ChatGPT took the internet by storm, so much so that even our parents (and grandparents!) are talking about it. Since AI is here to stay beyond the Christmas Eve dinner discussion, we put together a list of 2023 predictions we expect to see regarding AI and data.

1. Proactively handling data privacy regulations will become a top priority.

Regulatory changes can have a significant impact on how organizations handle data privacy: businesses must adapt to new policies to ensure their data is secure. Modifications to regulatory policies require governance and compliance teams to understand data within their company and the ways in which it is being accessed. 

To stay ahead of regulatory changes, organizations will need to prioritize their data governance strategies. This will mitigate the risks surrounding data privacy and potential regulations. As a part of their data governance strategy, data privacy and compliance teams must increase their usage of privacy, security, and compliance analytics to proactively understand how data is being accessed within the company and how it’s being classified. 

2. AI and LLMs will require organizations to consider their AI strategy.

The rise of AI and LLM technologies will require businesses to adopt a broad AI strategy. AI and LLMs will open opportunities in automation, efficiency, and knowledge distillation. But, as the saying goes, “With great power comes great responsibility.” 

There is disruption and risk that comes with implementing AI and LLMs, and organizations must respond with a people- and process-oriented AI strategy. As more AI tools and start-ups crop up, companies should consider how to thoughtfully approach the disruptions that will be felt in almost every industry. Rather than being reactive to new and foreign territory, businesses should aim to educate, create guidelines, and identify ways to leverage the technology. 

Moreover, without a well-thought-out AI roadmap, enterprises will find themselves technologically plateauing, their teams unable to adapt to a new landscape, and their investments failing to deliver returns: they won’t be able to scale or support the initiatives they put in place. Poor road mapping will lead to siloed, fragmented projects that don’t contribute to a cohesive AI ecosystem.

3. AI technologies, like Document AI (or information extraction), will be crucial to tap into unstructured data.

According to IDC, 80% of the world’s data will be unstructured by 2025, and 90% of this unstructured data is never analyzed. Integrating unstructured and structured data opens up new use cases for organizational insights and knowledge mining.

Massive amounts of unstructured data – such as Word and PDF documents – have historically been a largely untapped data source for data warehouses and downstream analytics. New deep learning technologies, like Document AI, have addressed this issue and are more widely accessible. Document AI can extract previously unused data from PDF and Word documents, ranging from insurance policies to legal contracts to clinical research to financial statements. Additionally, vision and audio AI unlock real-time video transcription insights and search, image classification, and call center insights.

Organizations can unlock brand-new use cases by integrating these AI technologies with existing data warehouses. Fine-tuning general-purpose models on domain data adapts them to a wide variety of use cases.

4. Data is the new oil.

Data will become the fuel for turning general-purpose AI models into domain-specific, task-specific engines for automation, information extraction, and information generation. Snorkel AI coined the term “data-centric AI,” an apt description of the current AI lifecycle. The last time AI received this much hype, the focus was on building new models. Now, very few businesses need to develop novel models and algorithms; what will set their AI technologies apart is their data strategy.

Data-centric AI enables us to take existing models and calibrate them to an organization’s data. Applying an enterprise’s data to this new paradigm will accelerate a company’s time to market, especially for those who have modernized their data and analytics platforms and data warehouses.

5. The popularity of data-driven apps will increase.

Snowflake recently acquired Streamlit, which makes application development more accessible to data engineers. Additionally, Snowflake introduced Unistore and hybrid tables (OLTP) to allow data science and app teams to work together off a single source of truth in Snowflake, eliminating silos and data replication.

Snowflake’s big moves demonstrate that companies are looking to fill gaps that traditional business intelligence (BI) tools leave behind. With tools like Streamlit, teams can automate data sharing and deployment, which is traditionally manual and Excel-driven. Most importantly, Streamlit can become the conduit that allows business users to work directly with AI-native and data-driven applications across the enterprise.

6. AI-native and cloud-native applications will win.

Customers will start expecting AI capabilities to be embedded into cloud-native applications. Harnessing domain-specific data, companies should prioritize building upon modular, data-driven application blocks with AI and machine learning. AI-native applications will win over AI-retrofitted applications.

When applications are custom-built for AI, analytics, and data, they are more accessible to data and AI teams, enabling business users to interact with models and data warehouses in a new way. Teams can begin classifying and labeling data in a centralized, data-driven way, rather than manually and repeatedly in Excel, feeding a human-in-the-loop system for review that improves the overall accuracy and quality of models. Traditional BI tools like dashboards, on the other hand, often limit business users to consuming and viewing data in a “what happened?” manner, rather than interacting with it in a more targeted way.

7. There will be technology disruption and market consolidation.

The AI race has begun. Microsoft’s strategic partnership with OpenAI and integration into “everything,” Google’s introduction of Bard and funding into foundational model startup Anthropic, AWS with their own native models and partnership with Stability AI, and new AI-related startups are just a few of the major signals that the market is changing. The emerging AI technologies are driving market consolidation: smaller companies are being acquired by incumbent companies to take advantage of the developing technologies. 

Mergers and acquisitions are key growth drivers, with larger enterprises leveraging their existing resources to acquire smaller, nimbler players to expand their reach in the market. This emphasizes the importance of data, AI, and application strategy. Organizations must stay agile and quickly consolidate data across new portfolios of companies. 


The AI ball is rolling. At this point, you’ve probably dabbled with AI or engaged in high-level conversations about its implications. The next step in the AI adoption process is to actually integrate AI into your work and understand the changes (and challenges) it will bring. We hope that our data and AI predictions for 2023 prime you for the ways it can have an impact on your processes and people.

Why choose 2nd Watch?

Choose 2nd Watch as your partner and let us empower you to harness the power of AI and data to propel your business forward.

  • Expertise: With years of experience in cloud optimization and data analytics, we have the expertise to guide you through the complexities of AI implementation and maximize the value of your data.
  • Comprehensive Solutions: Our range of services covers every aspect of your AI and data journey, from cost analysis and optimization to AI strategy development and implementation. We offer end-to-end solutions tailored to your specific needs.
  • Proven Track Record: Our track record speaks for itself. We have helped numerous organizations across various industries achieve significant cost savings, improve efficiency, and drive innovation through AI and data-driven strategies.
  • Thoughtful Approach: We understand that implementing AI and data solutions requires a thoughtful and strategic approach. We work closely with you to understand your unique business challenges and goals, ensuring that our solutions align with your vision.
  • Continuous Support: Our commitment to your success doesn’t end with the implementation. We provide ongoing support and monitoring to ensure that your AI and data initiatives continue to deliver results and stay ahead of the curve.

Contact us now to get started on your journey towards transformation and success.