What I Wish Everyone Knew About K-Means: Unsupervised Learning

K-Means is one of the simplest unsupervised learning methods. That raises the question: what is unsupervised learning? Let's have a quick overview.

Unsupervised learning is an approach where we train a model on data that has no labels (dependent variables).

A few keywords and their descriptions before we get to K-Means:

Features: The columns of the training data are treated as features. They are also called independent variables.

Labels: In supervised learning, the training data also carries labels, also called output variables or dependent variables. They are called dependent because their values depend on the independent variables.
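To make the idea concrete, here is a minimal sketch of K-Means in plain Python. It is only an illustration under stated assumptions: the toy `points`, the hand-picked starting centroids and the helper names `nearest` and `kmeans` are all hypothetical and not taken from the article. Note that the data has features only (two columns per point) and no labels; the cluster numbers are produced by the algorithm itself.

```python
def nearest(point, centroids):
    """Return the index of the centroid closest to `point` (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))


def kmeans(points, initial_centroids, max_iterations=100):
    """Plain K-Means: alternate between assigning points and moving centroids."""
    centroids = list(initial_centroids)
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        assigned = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, assigned) if a == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    # Final cluster index for every point, based on the final centroids.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Toy data: features only (two columns per row), no labels anywhere.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]

# Assumption: starting centroids are picked by hand, one per visible group.
centroids, labels = kmeans(points, [(1.0, 1.0), (8.0, 8.0)])
print(labels)
print(centroids)
```

Real implementations usually pick the starting centroids randomly or with a smarter seeding scheme and rerun several times; the hand-picked start here just keeps the sketch deterministic.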


Using an ML model and making a data scientist's work simpler by packaging it to deliver the solution

Developing a machine learning model, training it and deploying it is an end-to-end exercise. Here, I will explain how a model can be packaged so that it can be used in your Python program effectively and intelligently. It is not as difficult as it may seem. Once your code is complete and ready to be deployed, you will see how much flexibility packaging gives you to do things in a more efficient and organized manner. Let us get our hands dirty. I am starting with the below code in setup.py …

Filters have meaning: the context filter is the best of them

Tableau is a new-generation visualization tool. I consider it the easiest one for building visualizations for your audience. It has several features that distinguish it from other existing tools. It is capable of handling all sorts of problems in representing data in different formats.

Tableau is capable of achieving smart tasks in a smart way. Better to learn and master them.

There are certain problems which can’t be handled directly. Or, I would say, you have to tweak the existing features to meet your requirement. Here, I will explain how we can handle the requirement where we want…


Truly based on Data and Problem Statements

Machine learning has become a buzzword in the last few years. Everybody knows about it and wants to try it. That is a good thing, and it is the reason the field keeps getting more popular: the more people explore, the more ideas come. There are many algorithms and models, and it becomes a problem to choose the correct one for your problem statement. Let us walk through common steps for choosing an ML algorithm and see whether we can reach a standard where the chances of a wrong selection are reduced.

Common and Simple Steps before ML models…


Use cases and innovation will make you experience visualization differently

Tableau is the most elaborate and easy-to-implement visualization tool ever experienced by professionals. It is one of the in-demand tools. I have already explored the basics and important topics in the blog linked below. Now it is time to see more inventive ways of changing the existing charts in Tableau to suit your requirement so that you can produce different charts. To your surprise, you will see that it is not that difficult. In a single click, you can create the existing chart options in Tableau, and with a little bit of innovation, we can turn them into the required ones.

Tableau is capable of providing the favorite…

Overfitting | Less Data | Data Simulation — Solution is CV

Cross validation is one of the techniques that can make the training of your model more reliable with the given data. It is also known as rotation estimation or out-of-sample testing. You will understand why in a while!

Cross validation, or simply CV, is also referred to as out-of-sample testing or rotation estimation. Suppose you have a model that does not generalize well to test data; in other words, the model is overfitting. That often means you have too little data for the model to learn and converge. And, solution is…
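The "rotation" in rotation estimation can be illustrated with a minimal k-fold sketch in plain Python. This is an assumption-laden illustration, not the article's own code: the function name `k_fold_splits` is hypothetical, and the samples are split in their original order without shuffling. Each fold takes one turn as the held-out test set while the remaining folds form the training set, so every sample is used for testing exactly once.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) pairs, one pair per fold."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]                    # held-out fold
        train = indices[:start] + indices[start + size:]      # everything else
        yield train, test
        start += size


# Rotate through 5 folds over 10 samples.
splits = list(k_fold_splits(10, 5))
for train, test in splits:
    print("train:", train, "test:", test)
```

In practice you would fit the model on each `train` part, score it on the matching `test` part, and average the scores to get a steadier estimate than a single train/test split gives.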

Visualization Factor to Understand Data

Tableau is one of the best visualization tools available. It is true that we have had many data analytics tools until now, but none has offered so much depth for visually analyzing data and presenting it based on the needs of viewers. Dragging and dropping your way to a beautiful visualization that extracts meaning from raw data feels revolutionary. The best thing I like about new technologies is that they keep evolving and improving. This tool has impact and a future, so it becomes necessary to learn it.

There are many things and learning is…

Word to Vector : Understanding the Concept

NLP is a buzzword, and there are plenty of problem statements to experiment with. The deeper you go, the more insights you will get about the data. Innovation in exploring data can produce new ways to solve existing problem statements. We already experienced feature engineering in NLP using TF-IDF and PMI in an earlier blog. Link is below. Now it is time to move to the next steps in feature engineering.

Feature engineering in NLP can be thought of as vector compression. The idea is to get less sparse vectors and better performance. Other dimensionality reduction techniques like SVD can be applied too…
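To see why sparsity is the problem being addressed, here is a small sketch that builds plain word-count vectors for a toy corpus and counts how many entries are zero. The three sentences and the helper name `count_vector` are made up for illustration only; this is not word2vec itself, just the kind of sparse representation that denser vectors are meant to improve on.

```python
# Hypothetical toy corpus, chosen only to illustrate sparsity.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stocks fell sharply today",
]

# The vocabulary is every distinct word across the corpus.
vocab = sorted({word for doc in docs for word in doc.split()})


def count_vector(doc):
    """One entry per vocabulary word: how often it occurs in `doc`."""
    words = doc.split()
    return [words.count(term) for term in vocab]


vectors = [count_vector(doc) for doc in docs]

# Most entries are zero, because each document uses only a few vocabulary words.
zero_entries = sum(value == 0 for vector in vectors for value in vector)
total_entries = len(vectors) * len(vocab)
print(f"{zero_entries} of {total_entries} entries are zero")
```

With a real corpus the vocabulary runs to many thousands of words, so the share of zeros grows far larger than in this toy case.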

Developing Model and Tuning Steps For NLP

NLP stands for natural language processing, and it is one of the buzzwords in the real world. Everybody wants to learn it and gain expertise in this area. To start, NLP is directly tied to processing text, and we all know today’s world is full of text flowing in from everywhere. Processing this much data becomes very interesting. It brings lots of use cases, innovation and ideas for applying machine learning.

From the above discussion, we can understand how important and vast text processing is. It becomes difficult when it is very easy to implement and everybody can write simple lines of…

Master visualization to explore data in the first go

Python is one of the new-generation languages. It has libraries to visualize your data, explore it and get insights out of it. Matplotlib is one of the libraries most commonly used for exploring data. Every data analysis requires visualizing the data for different purposes, like finding outliers, density, sparsity, trends and, more importantly, normalization of data.

Let us explore matplotlib, and then we can see what the other libraries for visualization are.

Plotting the data to visualize it and extracting first-level meaning from it is a must-learn art for data analytics and machine…

Laxman Singh

Machine Learning Engineer | Data Science | MTECH NUS, Singapore
