Snowflake Summit 2024 After-Action Report and Workshop: Biz x Dev Roundtable
The Biz x Dev Roundtable series tackles data-related problems and trending topics through conversations between software engineers and business professionals.
Today's Attendees
Hayato Onodera - Sales Manager @Morph
Manager of Corporate Sales at Morph.
Naoto Shibata - CEO/Backend Engineer @Morph
He is our CEO and also the lead backend engineer at Morph.
Onodera:
The Snowflake Summit 2024 was recently held, and there were many announcements related to AI.
Today, I'd like to discuss how to actually use Snowflake's Cortex in our work.
Was there anything from the Snowflake Summit 2024 that caught your attention?
Shibata:
First of all, I’m happy about dark mode!
Of course, there were also many AI-related announcements, and the concept of the "AI Data Cloud" was strongly emphasized.
Onodera:
I had used the Notebook feature before, so the announcements were easy to follow, but there were still many features that made me think, "We can do this now."
Shibata:
It's great that we can use AI features without having to build a local environment. We can do a lot on Snowflake.
As for Notebooks, it's nice that we can run Streamlit code in them, which lets us handle everything from AI processing to simple applications in a single tool.
Onodera:
Seeing those announcements inspired me, and today I'd like to discuss while actually using Cortex.
Cleansing Sales Data Using LLMs
Onodera:
I've prepared some e-commerce data. Looking at the product information, physical products and fan club payments are mixed together.
This is problematic for analysis, so I used an LLM to classify the categories.
Here is the actual code.
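A minimal sketch of this kind of query, assuming a hypothetical ec_orders table with order_id and product_name columns (the mistral-large model choice is also an illustrative assumption):

```sql
-- Sketch: classify each product name into one of two categories.
-- Table, columns, and model are illustrative assumptions.
SELECT
    order_id,
    product_name,
    TRIM(SNOWFLAKE.CORTEX.COMPLETE(
        'mistral-large',
        'Classify the following product into exactly one category. '
        || 'Answer only "physical_product" or "fan_club_payment": '
        || product_name
    )) AS category
FROM ec_orders;
```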
Shibata:
You can call an LLM from inside a SELECT statement.
Onodera:
That's right. The results were good, and I think it can be used in actual customer projects.
Shibata:
Previously, we would query with SQL, call an API like ChatGPT's from Python to process the data, and then write the results back to Snowflake, so this simplifies things a lot.
Onodera:
As someone from the business side, I’m familiar with SQL, so it’s easy to handle.
But I wanted to ask, even with such tools, are there still cases where you need to use Python?
Shibata:
With this much functionality, many cases are covered.
However, the advantage of using Python is that you can adopt the latest models immediately. With an integrated solution, you have to wait until the tool, in this case Snowflake, supports them, whereas calling the model APIs directly from Python gives you more flexibility.
Also, in practice, tasks are rarely completed with LLM processing alone, so writing code still has its place.
I'm also curious how well Cortex can handle features like Function Calling, which constrain the output format of an LLM.
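Pending native structured-output support, one prompt-level workaround is to ask the model for JSON and validate the reply in SQL; a rough sketch, reusing the assumed table and columns from above:

```sql
-- Sketch: approximate structured output by prompting for JSON and
-- validating with TRY_PARSE_JSON (returns NULL if the reply is not valid JSON).
SELECT
    order_id,
    TRY_PARSE_JSON(SNOWFLAKE.CORTEX.COMPLETE(
        'mistral-large',
        'Return only a JSON object of the form '
        || '{"category": string, "confidence": number} for this product: '
        || product_name
    )) AS structured_result
FROM ec_orders;
```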
Here we're only trying it on data exported from the e-commerce site, but there might be situations where you want to combine it with other data, stored in Excel, for example.
Onodera:
That's true. In actual projects, it seems like Python processing will also be necessary.
Scoring Salesforce Prospective Customers
Onodera:
Another dataset I brought is Salesforce prospective customer data. Scoring prospective customers still relies heavily on qualitative information, so I want to label them more quantitatively.
There are many customer-scoring methods that focus on quantitative data, but building one myself is another matter. I thought it might be easy with Cortex's features, so I tried it out.
What I did was build a model from the data of customers who actually purchased in the past and then use it to score prospective customers.
Shibata:
You can do this with SQL too.
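A sketch of one way to do this in SQL with Snowflake's ML classification function; the table names (past_customers with a purchased label column, and prospects) are illustrative assumptions:

```sql
-- Sketch: train a classifier on past customers, then score prospects.
-- All table and column names are assumptions.
CREATE OR REPLACE SNOWFLAKE.ML.CLASSIFICATION purchase_model(
    INPUT_DATA => SYSTEM$REFERENCE('TABLE', 'past_customers'),
    TARGET_COLNAME => 'purchased'
);

SELECT
    prospect_id,
    purchase_model!PREDICT(INPUT_DATA => OBJECT_CONSTRUCT(*)) AS score
FROM prospects;
```

The prediction comes back as a semi-structured object holding the predicted class and class probabilities, so it can be unpacked with the usual colon syntax (e.g. score:class).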
Shibata:
It’s sufficient at least as a prototype. You can improve it while testing with actual data.
Onodera:
It only took about five minutes to build.
With the enhanced integration between Salesforce and Snowflake, I feel the possibilities have expanded.
(Source: https://www.snowflake.com/blog/bi-directional-data-sharing-snowflake-salesforce-ga/)
Shibata:
The Zero Copy integration, right? It’s really convenient.
As these developments progress, the hurdles for data integration will be lowered. I think the next step will be focusing on data utilization using these tools.
Onodera:
Actual customer data isn't always managed only in a CRM; there might be offline event application data in spreadsheets or inquiry data stored in cloud services. Lowering the hurdles for data integration could enable more data-driven sales and marketing.
Shibata:
For example, companies with multiple products could perform comprehensive data analysis for upselling and cross-selling.
In such an environment, having good data could become a competitive advantage.
Supporting the Creation of ‘Good Data’
Onodera:
Considering this, it would be great if Morph could support the creation of ‘good data.’
Shibata:
Exactly. We want to cover the creation of ‘good data’ and LLM-powered preprocessing for data analysis.
In data analysis, now that both machine learning and LLMs are in use, the requirements for data preprocessing have changed. I think it would be ideal for Morph to become complementary to tools such as Snowflake.