MARDREAMIN’ SUMMIT 2025
MAY 7-8, 2025 IN ATLANTA - GA



Hacking Future 101: Creative Ways to Transform Lead Generation with Machine Learning

Machine learning is all the rage in B2B marketing. That’s because it unlocks opportunities to connect with clients and prospects in the right place and at the perfect time.

In this session, you’ll begin your machine learning journey by learning how to:

Use data and machine learning to increase the impact of your marketing programs.
Segment your customers in conditions of scattered or missing data.
Improve customer targeting and campaign response with machine learning capabilities.
Leverage Natural Language Processing to boost your email open rates.
Use machine learning to create personalized content recommendations for email campaigns.

Sujata Karan
Marketing Performance & Analytics Manager, Wärtsilä Corporation

Alexandra Novikova
Marketing Development Manager

Video Transcript

Speaker 0: Hello, everybody. We are live. This is Hacking Future 101: Creative Ways to Transform Lead Generation with Machine Learning. We have Alexandra Novikova and Sujata Karan here today, and they are going to be presenting the session. So take it away.

Speaker 1: Great. Thanks, Amber. So hello, everyone. Good morning, good afternoon, and good evening. Thank you for joining us today. Today, we are going to talk about how we can transform marketing and lead generation processes with machine learning algorithms. But before we move on to the topic, I’d like to introduce myself. I’m Sujata Karan. I work with the Finnish industrial conglomerate Wärtsilä, which is a global leader in smart technologies and complete life cycle solutions for the marine and energy markets. I work here as Marketing Analytics and Performance Manager, and I have ten years of experience overall in the area of analytics and insights. I have worked with companies like Colgate-Palmolive, Tata Consultancy Services, etcetera. Now I would like to invite Alexandra, my colleague, to introduce herself, and then we can proceed with the topic. Alexandra.

Speaker 2: Thank you, Sujata. Hi, everyone. My name is Alexandra Novikova, and I’m a Marketing Development Manager at Wärtsilä. I have an extensive background in marketing and IT and a passion for developing digital marketing projects. Recently, I have taken an interest in machine learning and how it can push digital marketing to the next level. Thank you.

Speaker 1: Thanks, Alex. So moving on to the next slide, where we are going to talk about who we are, who Wärtsilä is. As I said, Wärtsilä is a Finnish conglomerate serving the marine and energy markets, and it’s a €4.6 billion company. It employs around 18,000 people across the globe, and it is present in 70 countries with 200 locations. And we are a diverse company, so we have around 140 nationalities working with Wärtsilä.

So moving on to the topic of today, which is how to utilize machine learning to improve marketing efficiency, and these are the topics that we are going to cover. The first is using data and machine learning to increase the impact of marketing programs. Then I’ll touch upon segmenting customers in conditions of scattered or missing data. Then I’ll also be talking about how to improve customer targeting and campaign response with machine learning capabilities. Then Alexandra will be talking about leveraging natural language processing to boost email open rates and using machine learning to create personalized content recommendations for email campaigns. So let’s get right to it.

So first, some of the marketing areas that can be improved with machine learning: recommendation systems, target audience prediction, personalization with NLP, and SEO automation and optimization. A recommendation system is basically like a salesman who knows, based on your history and preferences, what you like. It uses users’ behavior data, historical purchases, interests, and activity data to predict the items they would prefer to buy. For a business, personalized recommendations can achieve greater customer engagement and consumption rates while boosting ROI significantly, and that’s what we strive for, the ROI. Machine learning-based predictive recommendations can enable B2B marketers to provide content and product suggestions that are unique to a particular account’s preferences, and recommendations can be generated dynamically based on the known account data, account scores, and analysis of visitor behavior and preferences. So that’s the first use case that we have. Then there’s target group prediction. Are we able to use our budget efficiently to retarget the people who are more likely to respond to our campaigns? I would say no, not very optimally. So how do we achieve it? Based on the web data and historical responses from the campaigns, we can create a segment of the audience that has a higher propensity to respond to our campaigns, and we can retarget those contacts. How do we do that? We can train a model on those targets from our CRM system by applying a logistic regression algorithm. I’ll be talking more about this in the later part of this session. Then coming to personalization with NLP. Personalization is a buzzword, and we can apply natural language processing to understand individual customers’ metadata, which my colleague Alexandra will be talking about in the later part of this session. Then comes SEO automation and optimization.
This is also part of NLP, and we can discover and optimize keywords with the help of NLP. It can also create real-time targeted content by taking online behavioral data, search history, etcetera. So I have listed four use cases here, but these are not the only things we can achieve, and this is not an exhaustive list.

Now moving on to see how we can deal with scattered or missing data. But before that, what data points do we have, what kind of data do we have? Demographic data; geographic data, such as location; behavioral data, how people are behaving on our website. Then comes psychographic data, which is basically values and interests, and this kind of data might not be available all the time. And then comes the customer journey data, whether the customer has interacted with some of our webinars, some of our articles, white papers, etcetera. Next, continuing on scattered or missing data. We know data is never perfect. So what do we do? We take stock of the data, and we ensure the data is correct, clean, complete, properly formatted, and verified. And how do we do that? By removing clutter from the data, such as duplicate data, irrelevant data, redundant data, or low-quality data. But to do so, what is the first, proactive step that we should be taking? We should be evaluating and standardizing the CRM data points that we have and improving data collection methods by ensuring that the entry points are accurate, such as customer-facing data input forms, employee-facing data input forms, third-party customer list imports, and integrations.
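The cleanup steps described above (standardizing fields, dropping incomplete records, removing duplicates) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names `email` and `country`, the choice of email as the duplicate key, and the sample records are all hypothetical and do not come from the talk.

```python
def clean_contacts(records, required=("email", "country")):
    """Standardize, drop incomplete records, and deduplicate by email.

    Assumption: email is a usable duplicate key and `required` lists the
    fields a record must have to be kept.
    """
    seen = set()
    cleaned = []
    for rec in records:
        # Standardize: trim whitespace on every string field.
        norm = {k: (v.strip() if isinstance(v, str) else v) for k, v in rec.items()}
        # Standardize: compare emails case-insensitively.
        norm["email"] = (norm.get("email") or "").lower()
        # Drop records that miss any required field.
        if any(not norm.get(field) for field in required):
            continue
        # Drop duplicates of an email we have already kept.
        if norm["email"] in seen:
            continue
        seen.add(norm["email"])
        cleaned.append(norm)
    return cleaned


# Made-up sample records for illustration only.
contacts = [
    {"email": " Ana@Example.com ", "country": "FI"},
    {"email": "ana@example.com", "country": "FI"},   # duplicate of the first
    {"email": "bo@example.com", "country": ""},      # missing country
    {"email": "cy@example.com", "country": "SE"},
]
cleaned = clean_contacts(contacts)
```

In a real CRM export the duplicate key and the required fields would depend on the organization's own data model.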

So enough about scattered or missing data. Let’s get to the use case that I have been working on, which is improving targeting and campaign response with machine learning capabilities. So what if I tell you that with machine learning capabilities, I can have a list of contacts that are at least 60% probable to respond to my campaign, and I can be at least 75% sure that these contacts are going to respond to my campaign? How do we do that? We do it by following these steps. First, we do exploratory data analysis, then we do some outlier detection, then feature engineering and feature selection, and lastly, data modeling. Alright. So exploratory data analysis is the first and most basic part of any machine learning algorithm, and this is where we set our base. We try to find missing data fields. Any data field which has, let’s say, more than 3% missing values, we will not consider in our algorithm. Then comes the discrimination ability of each field. Discrimination ability basically tells us how well a category correlates with my campaign responses. In this case, the categories are account record type, line of business, and market segment, which are fields coming from my industry. Once we have that, we have to ensure that all these fields have at least 40 observations and a minimum discrimination ability of 15%. Once we have done this part, we try to find the correlation matrix of the discrete variables. If these fields that we have, account record type, account status, line of business, market segment, and so on and so forth, are highly correlated with each other, then we might not consider those fields, because they will not actually do any good to our machine learning algorithm. They will just increase the work by requiring more analysis.
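The missing-value and minimum-observation checks described above can be sketched as follows. The 3% and 40-observation thresholds are the ones quoted in the talk; everything else (field names, the `responded` target column, the synthetic rows) is a hypothetical stand-in. The talk does not define how "discrimination ability" is computed, so this sketch only reports the response rate per category as a rough proxy.

```python
from collections import Counter

MAX_MISSING = 0.03  # fields with more than 3% missing values are dropped
MIN_OBS = 40        # a category needs at least 40 observations to be reported


def missing_fraction(rows, field):
    """Share of rows where the field is absent or empty."""
    missing = sum(1 for r in rows if r.get(field) in (None, ""))
    return missing / len(rows)


def usable_fields(rows, fields, max_missing=MAX_MISSING):
    """Keep only fields whose missing share is within the threshold."""
    return [f for f in fields if missing_fraction(rows, f) <= max_missing]


def response_rate_by_category(rows, field, target="responded", min_obs=MIN_OBS):
    """Response rate per category value, for categories with enough rows."""
    counts, positives = Counter(), Counter()
    for r in rows:
        value = r.get(field)
        if value in (None, ""):
            continue
        counts[value] += 1
        positives[value] += r[target]
    return {v: positives[v] / n for v, n in counts.items() if n >= min_obs}


# Synthetic data: 60 "Marine" and 40 "Energy" contacts, 10% missing department.
rows = [
    {
        "line_of_business": "Marine" if i < 60 else "Energy",
        "department": None if i % 10 == 0 else "Ops",
        "responded": 1 if (i < 60 and i % 3 == 0) or (i >= 60 and i % 8 == 0) else 0,
    }
    for i in range(100)
]
fields = usable_fields(rows, ["line_of_business", "department"])
rates = response_rate_by_category(rows, "line_of_business")
```

Here `department` is excluded because 10% of its values are missing, while the two lines of business show clearly different response rates.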
So we can just ignore some of the fields which are not relevant. And then lastly, in the exploratory data analysis, we also look at the value distribution according to campaign response. In this case, I have taken four examples: account record type, line of business, market segment, and contact department. On the x axis, you see one and zero, one meaning the contact has responded to the campaign and zero meaning they have not responded, and you see how these campaign responses are spread across, let’s say, our account record type or line of business or market segment or contact department. Once we are done with exploratory data analysis, we do outlier detection. Outlier detection is a very useful and powerful tool, and we should be doing it because outliers are basically those data points which do not fall into the trend. Outliers can exist for several reasons. There could be data entry errors, measurement errors, some experimental errors, or intentional errors, basically intentionally putting in some outliers to ensure that our outlier detection methodology is working properly. And then come data processing errors and some sampling errors. Once we have all this set up, we move on to feature engineering and feature selection. Feature engineering and feature selection is the process of selecting the most relevant features that have an impact on our machine learning algorithms, and we construct our models based on those features. So how do we decide which features are most worthy to build our machine learning algorithm on? We decide it by running a chi-square test and seeing which of these fields have the highest impact on our campaign responses. In this case, let’s say account record type, account status, and line of business are the most important features that we have.
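As a rough illustration of the chi-square step, the sketch below computes the Pearson chi-square statistic between each categorical field and the campaign response, then ranks the fields by it. The field names and counts are made up for the example, and the talk does not say which tool or exact procedure was used, so treat this as one possible way to do it.

```python
from collections import Counter


def chi_square(rows, field, target="responded"):
    """Pearson chi-square statistic for a categorical field vs. the target."""
    observed = Counter((r[field], r[target]) for r in rows)
    row_totals = Counter(r[field] for r in rows)
    col_totals = Counter(r[target] for r in rows)
    n = len(rows)
    stat = 0.0
    for category, row_total in row_totals.items():
        for outcome, col_total in col_totals.items():
            # Count expected if the field and the response were independent.
            expected = row_total * col_total / n
            stat += (observed[(category, outcome)] - expected) ** 2 / expected
    return stat


def make_rows(status, responded, count):
    """Hypothetical contacts; region alternates so it carries no signal."""
    return [
        {
            "account_status": status,
            "region": "North" if i % 2 == 0 else "South",
            "responded": responded,
        }
        for i in range(count)
    ]


rows = (
    make_rows("Active", 1, 30)
    + make_rows("Active", 0, 20)
    + make_rows("Inactive", 1, 10)
    + make_rows("Inactive", 0, 40)
)
scores = {f: chi_square(rows, f) for f in ("account_status", "region")}
ranked = sorted(scores, key=scores.get, reverse=True)
```

A higher statistic means the field's categories differ more in how often contacts respond, so `account_status` ranks above `region` in this synthetic data.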
Once we have decided on the features or the fields, we move on to the data modeling part and check whether the data is balanced or not. If the data is balanced, then we can just go ahead and run the logistic regression. If it is not, then we should be applying some oversampling techniques so that the data is balanced between campaign responses, yes, and campaign responses, no. Once we have the data balanced, we can actually apply the logistic regression, and it will give me a list of the contact IDs with their probability and accuracy to respond to the campaign. So how do I read this? I read it like this: contact ID 2626 has at least a 67% probability of responding to the campaign, and I’m at least 65% sure that this particular contact is going to respond to my campaign. So that’s pretty much it about the use case that I’m working on, and I will invite Alexandra now to talk about the natural language processing part. Thank you.
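The balancing and modeling steps can be sketched with the standard library alone. This is a minimal illustration, not the speakers' implementation: it uses plain random oversampling (duplicating minority-class rows) and a small gradient-descent logistic regression, and the two binary features `is_active` and `is_marine`, the counts, the learning rate, and the number of epochs are all assumptions made for the example.

```python
import math
import random


def oversample(rows, target="responded", seed=0):
    """Random oversampling: duplicate minority rows until classes match."""
    rng = random.Random(seed)
    pos = [r for r in rows if r[target] == 1]
    neg = [r for r in rows if r[target] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return rows + extra


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, x):
    """Probability of response; w[0] is the bias term."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the log loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def cell(is_active, is_marine, n_yes, n_no):
    """Hypothetical contacts sharing the same feature values."""
    base = {"is_active": is_active, "is_marine": is_marine}
    return ([dict(base, responded=1) for _ in range(n_yes)]
            + [dict(base, responded=0) for _ in range(n_no)])


# Imbalanced synthetic data: 14 responders vs. 56 non-responders.
rows = cell(1, 1, 8, 2) + cell(1, 0, 3, 7) + cell(0, 1, 2, 18) + cell(0, 0, 1, 29)
balanced = oversample(rows)

X = [[r["is_active"], r["is_marine"]] for r in balanced]
y = [r["responded"] for r in balanced]
w = fit_logistic(X, y)

# Score each feature combination, as one would score each contact.
scores = {(a, m): predict_proba(w, [a, m]) for a in (0, 1) for m in (0, 1)}
```

Note that a model trained on oversampled data tends to overstate absolute response probabilities, so the scores are most useful for ranking contacts rather than as literal percentages.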

Speaker 2: Thank you, Sujata. Let’s move on to the second part of our project, where you will gain an understanding of how to leverage natural language processing (NLP) to boost your email open rates. But before we dive right in, let’s have a quick overview of what NLP actually is and how you can apply it in your daily marketing activities. So what’s all the buzz about? In short, NLP is an area of artificial intelligence that focuses on the understanding, analysis, editing, and generation of human language by a computer. In other words, it enables machines to make sense of our language so that they are able to perform different tasks with it. Probably the most famous examples of NLP that you’ve all heard about are virtual assistants like Alexa or Siri. But, of course, not everything done by NLP has to be connected to a gadget. Other examples that can run without humans are automatic text prediction when you write your emails, or even translation of Instagram stories. The possibilities of NLP usage are literally endless, and it’s just getting started. So how does NLP work? Basically, it takes unstructured data and turns it into a structured format by utilizing different algorithms, both supervised and unsupervised ones. One of the most common ones, perhaps, is named entity [recognition]. What it does is look through the text, find patterns within that text, and then group those patterns either into predefined categories or into categories it suggests on its own. This is especially valuable in any situation when you need to find meaningful information in a large amount of unstructured text data. And as you can see, the categories can be different ones. They can be name, date, character, you name it. And there are plenty of other algorithms within NLP for various use case scenarios, such as sentiment analysis, topic modeling, text classification, and so on and so forth.
But they are out of scope for our current presentation. And in the next slide, just to get you a little bit more excited about this topic, I want to present you with more amazing possibilities of what you can do with NLP. Again, I’m not going to go through all of them, but probably one of its most exciting use cases is content creation. As you all probably know, regularly producing fresh content is often expensive and time consuming. But NLP can help you create automated, keyword-optimized blog posts, landing pages, emails, reports, and any other marketing content that you can think of. Of course, it won’t replace all content creation, but it’s a great start. Another good use of NLP is sentiment analysis, with the help of which you can review customer feedback, comments, and any large amount of data and understand your position against your competitors on social media, for example. Combine this with keyword extraction and topic modeling algorithms, and you get yourself a powerful tool that will let you, for example, analyze your form replies or support tickets, extract the main keywords from there, tag them into separate groups, and then automatically forward them to the responsible people based on those tags without any human involvement. Imagine the time and money this particular use case alone can save you. But enough of the theory.
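To make the ticket-routing idea concrete, here is a deliberately simple sketch: frequency-based keyword extraction plus a lookup table that maps keywords to teams. It is a toy stand-in for the trained NLP models the talk has in mind; the stopword list, the keyword-to-team table, and the sample texts are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list and keyword-to-team routing table.
STOPWORDS = {"the", "and", "for", "not", "did", "that", "with", "our", "needs"}
ROUTES = {
    "invoice": "billing",
    "payment": "billing",
    "engine": "service",
    "maintenance": "service",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def keywords(text, top_n=3):
    """Most frequent non-stopword words, a crude form of keyword extraction."""
    counts = Counter(w for w in tokenize(text) if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]


def route(text, default="general"):
    """Pick the team whose keywords appear most often in the text."""
    tags = Counter(ROUTES[w] for w in tokenize(text) if w in ROUTES)
    return tags.most_common(1)[0][0] if tags else default


billing_team = route("My invoice shows a payment I did not make")
service_team = route("Engine maintenance is overdue")
fallback_team = route("Hello there")
top_keyword = keywords("The engine needs engine maintenance", top_n=1)
```

A production setup would replace the lookup table with a text classifier and add sentiment scoring, but the flow of extract, tag, and forward stays the same.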

Let’s talk about what’s cooking here at Wärtsilä. Here in marketing operations, we have had our eyes for a really long time on the various exciting possibilities that machine learning can offer us. And after initial research, it was clear that NLP is something that has a lot of potential, especially for email marketing campaigns, as they have a lot of content. Thus, the project idea was born. The main idea of this project is to improve marketing efficiency by utilizing the historical data from Pardot and CRM campaigns. In particular, we’re interested in increasing such KPIs as open rate, CTR, and webinar registrations. But, of course, you can have your own. This can be done by detecting the most commonly used keywords, topics, sentiments, and so on and so forth in the most opened and clicked-through email subject lines and content, as well as potentially discovering possible target groups based on what the email contains. Grouping and analyzing these findings should allow us to produce highly personalized recommendations for our content creators. And this is what lies within the scope of our current project. However, should it be successful and should the data prove to be reliable enough, our planned next step is to expand the algorithms used and eventually work on utilizing NLP to auto-generate content for the email campaigns. As I mentioned earlier, the possibilities are quite endless. So what does it mean in practice? After initial planning, we came to the realization that before diving deeper, we need to first figure out the basics. This means what kind of content our customers actually want to see. So the main idea of this first step is to utilize a supervised machine learning algorithm. Most likely, it’s going to be gradient boosting, together with a tool called Amazon SageMaker.
The goal is to analyze email subject lines and content and look for the most commonly used patterns in the most opened and clicked-through emails. Such patterns could be, for example, length of text, language, type of campaign, photo or video, and many others. Once successful, this model will then be developed further and expanded to utilize more NLP algorithms, such as sentiment analysis, text classification, topic modeling, and so on. Combined with the customer segmentation project mentioned earlier by Sujata, this will allow us to generate extremely precise email campaigns, both target group and content wise.
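Before any gradient boosting model can look for patterns, subject lines have to be turned into features. The sketch below shows what that preparation step might look like and compares average open rates across one feature. It does not implement gradient boosting or use SageMaker; the chosen features, the subject lines, and the open rates are invented for illustration.

```python
import re
from statistics import mean


def subject_features(subject):
    """Turn a subject line into simple, model-ready features."""
    return {
        "length": len(subject),
        "word_count": len(subject.split()),
        "has_number": bool(re.search(r"\d", subject)),
        "is_question": subject.strip().endswith("?"),
    }


def open_rate_by_feature(emails, feature):
    """Average open rate for each value of one subject-line feature."""
    groups = {}
    for email in emails:
        key = subject_features(email["subject"])[feature]
        groups.setdefault(key, []).append(email["open_rate"])
    return {key: mean(rates) for key, rates in groups.items()}


# Made-up historical campaigns; real data would come from the email platform.
emails = [
    {"subject": "5 ways to cut fuel costs", "open_rate": 0.30},
    {"subject": "Is your fleet ready for 2030?", "open_rate": 0.40},
    {"subject": "Webinar invitation", "open_rate": 0.10},
    {"subject": "Our latest newsletter", "open_rate": 0.20},
]
by_number = open_rate_by_feature(emails, "has_number")
features = subject_features("Is your fleet ready for 2030?")
```

The same feature dictionaries could later be fed to a supervised model, with open rate or click-through as the target.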

And so in the next slide, I would like to provide a high-level visual example of what will be done and what kind of results we’re aiming for. As you can see, various sections of an email can give us quite interesting information. For example, we can see if our customer would prefer a more formal or informal greeting, if they would like to have statistics mentioned in their email or not, or what kind of [content] they’re interested in. And having analyzed all of this, we are hoping to get aggregated data that could look something like what you can see on the right. This recommendation can then be sent directly to the marketing managers, content creators, and any other involved stakeholders, considerably decreasing the time it takes to produce a piece of content. So the main idea here is to remove unnecessary guessing when creating your marketing content and introduce a highly personalized, data-driven solution instead. And so how would you start if you would like to implement something similar in your company? Well, first of all, you need to make sure you understand your whole email campaign creation process from start to finish. Talk to marketing managers, content creators, project managers, developers. Find out what you do in house and what parts you outsource, what technical tools are used, and so on and so forth. Any information you can get your hands on will be very useful. During this step, it is especially important to understand what pain points you will solve with this project, because all [companies] are different and so are the pain points. So make sure you’re not solving somebody else’s pain points. Also, at the same time, you have to set the baseline KPIs that you will use to measure and track success. Once you’ve got this down, outline your ideal email campaign creation process.
Here, you need to think carefully about the possible restrictions and bottlenecks of your current CRM systems and also about internal policies, standards, and other things. At the same time, understand your time frame and what you can realistically achieve within it, and also set your ideal KPIs, upon reaching which you will deem the project successful. Afterwards starts the data preparation step. I won’t talk about this a lot, as Sujata already explained this part in quite some detail. However, one thing I would like to mention is to make sure that you have all the necessary data aggregated in one place. It could be a custom database where you have pulled all your marketing data in, but it could also be something else. Creativity here is your limit. Then, when your data is ready, you can start preparing and planning how you’re going to go about your machine learning journey. As the developer mantra goes, don’t reinvent the wheel. In other words, before you try to build that fancy custom machine learning model and spend all your yearly budget there, first check if there are any available tools or open-source libraries that can help you out. And I can assure you there are plenty nowadays. Quite often, a combination of them will solve your problem with no or minimal custom coding and investment of your time and money. And last but not least, a word of wisdom. Make it clear to your stakeholders that this is a long-term project. This is important, as quite often companies would like to see a very quick return on investment, and nobody blames them. But this is not the case here. Machine learning projects can take a lot of time to show even the first initial results, so you are in it for the long haul. And with that, I want to wish you good luck and finish our presentation for today by thanking you for listening.
And if you have any comments or questions, please feel free to reach us on LinkedIn or by email. Thank you very much.

Speaker 0: Thank you, Luise. That was great. We have one question in the chat. You mentioned Amazon SageMaker. What other tools can you recommend?

Speaker 2: Perhaps I’ll take this one. It all depends on your need, basically. But the most famous ones I can recommend are probably IBM Watson and also Google Cloud Machine Learning. Those have quite a few out-of-the-box solutions for many various needs. But at the same time, if your problem is something smaller, and you need to fix a smaller, local problem and you know at least a little bit of coding, then Google different Python libraries. There are plenty of open-source ones, and I’m pretty sure you’re going to find something.

Speaker 0: I think there are a few other questions, but we’re running out of time. If you want to follow up, you can contact either of our presenters at the email addresses on the screen. We also have some discussion boards at ParDreamin and lots of places to get questions answered. So I want to thank our presenters and thank our sponsors today for helping to make ParDreamin happen. And I hope everybody enjoys the rest of their day. Thanks.

Speaker 2: Bye.

Speaker 1: Thank you. Bye.