www.ijcrt.org © 2021 IJCRT | Volume 9, Issue 3 March 2021 | ISSN: 2320-2882
Predictive Analytics On IPL
Indian Premier League
1Gulladurti Tarunkanth Reddy, 2Beeravalli Vamshi Reddy, 3Routhu Praveen, 4Majji Naveen Kumar
1Student, 2Student, 3Student, 4Student
1Computer Science & Engineering,
1Lovely Professional University, Jalandhar, Punjab
Abstract: In cricket, the Twenty20 format is the most watched and best loved, and the winner of a match often cannot be
predicted until the last ball. The Indian Premier League (IPL) began in India in 2008 and is now the most popular T20 league
in Asia. We predict match outcomes using ensemble techniques and compare the results with other models such as the random
forest regressor. We use Python libraries and frameworks to produce the predictions, the Tableau platform to build the
analyses and dashboards, and HTML and CSS for the user interface.
Index Terms - Hypertext Markup Language, Cascading Style Sheets, Tableau, Python, Machine Learning, Data Science.
I. INTRODUCTION
Cricket is the most popular sport in many Asian nations. A large number of matches are played across various countries
during a year, and the World Cup is played once every four years among the cricket-playing countries. Cricket is played
in several formats, such as One Day International (ODI), T20 and Test matches. Apart from these, many matches are played at
state and district level within the country. For any such match, including World Cup matches, a squad of a playing eleven
and four additional players must be chosen. A cricket team comprises batsmen and bowlers, along with one wicket-keeper who
can also bat or bowl. The selectors and the team captain have to choose the batsmen, the bowlers and the wicket-keeper.
Every batsman is assigned a particular batting position in the playing eleven, and the team includes different types of
bowlers, such as spin, fast and medium-fast.
The IPL is a professional Twenty20 cricket league in India, usually contested between March and May of each year by eight
teams representing eight different cities or states in India. The league was founded by the BCCI in 2007 and has an
exclusive window in the International Cricket Council Future Tours Programme. It is the most-attended cricket league in the
world; in 2014 it was ranked sixth by average attendance among all sports leagues, and in 2010 it became the first sporting
event in the world to be broadcast live on YouTube. According to Duff & Phelps, the brand value of the IPL in 2019 was
₹475 billion (US$6.7 billion). According to the BCCI, the 2015 IPL season contributed ₹11.5 billion (US$160 million) to the
GDP of the Indian economy. There have been 13 seasons of the IPL so far. The current title holders are Mumbai Indians, who
won the 2020 season. The venue for the 2020 season was moved to the United Arab Emirates because of the COVID-19 pandemic.
The predictions and analysis are motivated by the viewers: not every viewer can guess the winner of a particular IPL match,
nor analyse the data from all previous matches of a particular team. To address this problem, we developed the predictions.
The sections below (User Interface, Data Cleaning, Data Analysis, Predictions, etc.) describe the work done to predict match
winners and scores and to analyse past IPL data with various charts.
II. LITERATURE REVIEW
We studied many journal papers. Some describe deployed ML models, while others present only the ML model itself. In the
latter case, the user must supply the exact input values the model expects, and the output is available only within that
particular coding application. Some of the papers also did not address score prediction. We therefore decided to develop a
web-based IPL prediction project, because users find a web interface more comfortable, and to include score prediction,
because not all viewers are interested only in the winner of the match; some are interested in the overall score.
III. PROPOSED FRAMEWORK
We used different Python frameworks and libraries to develop an ML model, and built a user interface with HTML and CSS
(Bootstrap) that gives users quick access to the analysis and predictions. We also used MySQL as the database. A user who
wants to view content on the website must first log in or register.
IJCRT2103664 International Journal of Creative Research Thoughts (IJCRT) www.ijcrt.org
3.1 User Interface
The interface is built using HTML and CSS. It consists of several pages with different contents:
Page Home – It contains the navigation bar with the names of all the pages; clicking a name redirects to the corresponding
page. The page shows a slider whose image changes every 2 seconds, as shown in fig. 1. (a).
fig. 1. (a) Home Page
Page Contact Us – It has the details of all the team members who worked on the project, as shown in fig. 1. (b).
Fig. 1. (b) Contact Us Page
Page Feedback – Users can give feedback on the interface, for example whether it is good or not.
Pages of Data Analysis – There are 5 to 6 pages under Data Analysis, each with a different analysis. The user can filter the
details, for example by year, and see the results as shown in fig. 1. (c).
fig. 1. (c) Teams and their Average runs per Match
Pages of Dashboards – Two or more analyses are combined to create a dashboard. When the user applies a filter to the
dashboard, it is reflected in all the analyses present in that dashboard.
Pages of Predictions – Users give their inputs within the interface, as shown in fig. 1. (d), and the output of the ML model
is returned through the backend, so that users get the required predictions based on their inputs.
fig. 1. (d) Win Prediction
3.2 Data Collection
Data was collected from the Kaggle website. It covers the players, their performance, and the matches together with their
results (won or lost). The data was cleaned using the Tableau platform, where Deccan Chargers and Delhi Daredevils were
merged with the current teams Sunrisers Hyderabad and Delhi Capitals. A separate dataset was also collected from Kaggle for
score prediction.
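The paper performs this cleaning step in Tableau. As an illustration only, the same merge of historical team names can be sketched in Python. The field names `team1`, `team2` and `winner` and the sample record are assumptions, not taken from the paper's dataset.

```python
# Hypothetical mapping from historical franchise names to the current ones,
# following the merge described in Section 3.2.
TEAM_ALIASES = {
    "Deccan Chargers": "Sunrisers Hyderabad",
    "Delhi Daredevils": "Delhi Capitals",
}


def normalize_team(name):
    """Return the current franchise name for a (possibly historical) team name."""
    cleaned = name.strip()
    return TEAM_ALIASES.get(cleaned, cleaned)


def normalize_matches(matches, team_fields=("team1", "team2", "winner")):
    """Return copies of the match records with team names normalized."""
    normalized = []
    for match in matches:
        row = dict(match)  # copy so the input records are left untouched
        for field in team_fields:
            if row.get(field):
                row[field] = normalize_team(row[field])
        normalized.append(row)
    return normalized


# Made-up record, for illustration only.
sample = [
    {"season": 2009, "team1": "Deccan Chargers",
     "team2": "Delhi Daredevils", "winner": "Deccan Chargers"},
]
cleaned = normalize_matches(sample)
print(cleaned)
```

Keeping the aliases in one dictionary means later rebrands can be handled by adding a single entry.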
3.3 Data Visualization
Data analysis and visualization were performed using Tableau, producing several views of the data. Some of them are:
Analysis
Average runs per match, - The visualization uses the runs scored by each team in every match to compute the average runs per
match for every team, as shown in fig. 2. (a).
fig. 2. (a) Teams and their Run per Match
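The average runs per match is computed in Tableau in the paper. A minimal Python sketch of the same aggregation is given here for illustration. The `(match_id, team, runs)` record layout and the numbers are assumptions.

```python
from collections import defaultdict


def average_runs_per_match(innings):
    """Average runs per match for each team.

    innings: iterable of (match_id, team, runs) tuples, one per team innings.
    """
    total_runs = defaultdict(int)
    matches_played = defaultdict(set)
    for match_id, team, runs in innings:
        total_runs[team] += runs
        matches_played[team].add(match_id)  # count each match once per team
    return {team: total_runs[team] / len(matches_played[team])
            for team in total_runs}


# Made-up numbers, for illustration only.
innings = [
    (1, "CSK", 180), (1, "MI", 165),
    (2, "CSK", 150), (2, "RCB", 152),
]
averages = average_runs_per_match(innings)
print(averages)
```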
Match Details, - This analysis uses a donut chart showing the exact percentages of matches won by batting first, won by
chasing, and tied, based on previous matches, as shown in fig. 2. (b).
fig. 2. (b) Match Details
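The percentages behind such a donut chart can be sketched as follows. This assumes each match record carries a winning margin in runs and in wickets, as in some public IPL datasets. The tuple layout and the sample values are hypothetical, and matches with no result would need to be filtered out beforehand.

```python
from collections import Counter


def classify_outcome(win_by_runs, win_by_wickets):
    """Label a match by how it was won (assumed margin fields)."""
    if win_by_runs > 0:
        return "won batting first"
    if win_by_wickets > 0:
        return "won chasing"
    return "tied"


def outcome_percentages(matches):
    """Percentage share of each outcome, rounded to one decimal place."""
    counts = Counter(classify_outcome(runs, wickets) for runs, wickets in matches)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {outcome: round(100 * n / total, 1) for outcome, n in counts.items()}


# Made-up (win_by_runs, win_by_wickets) pairs, for illustration only.
matches = [(10, 0), (0, 5), (0, 7), (0, 0)]
shares = outcome_percentages(matches)
print(shares)
```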
Status of Teams, - Every team's wins and losses were analysed and shown in a table with total matches played, wins and
losses, as shown in fig. 2. (c).
fig. 2. (c) Status of Team
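A played/wins/losses table of this kind can be derived from match records as in the sketch below. The `(team1, team2, winner)` layout is an assumption, with `None` standing for a tie or no result, and the sample matches are made up.

```python
from collections import defaultdict


def team_status(matches):
    """Count matches played, wins and losses per team.

    matches: iterable of (team1, team2, winner); winner is None when the
    match was tied or had no result.
    """
    table = defaultdict(lambda: {"played": 0, "wins": 0, "losses": 0})
    for team1, team2, winner in matches:
        for team in (team1, team2):
            table[team]["played"] += 1
            if winner is None:
                continue  # neither a win nor a loss
            if team == winner:
                table[team]["wins"] += 1
            else:
                table[team]["losses"] += 1
    return dict(table)


# Made-up results, for illustration only.
results = [("CSK", "MI", "CSK"), ("MI", "RCB", "MI"), ("CSK", "RCB", None)]
status = team_status(results)
for team, row in sorted(status.items()):
    print(team, row)
```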
Most Runs in IPL/Player, - The most runs scored per player in every season of the IPL, with year as a filter, as shown in fig. 2. (d).
fig. 2. (d) Most Runs Scored in IPL
3.4 Dashboards of Data analysis
Dashboards are created by combining two or more Tableau sheets and placing them in a new sheet.
Stats Dashboards, - This dashboard shows both the Orange Cap and Purple Cap players, with year as a filter, as shown in
fig. 3. (a).
fig. 3. (a) Stats Dashboard
Some other dashboards are still under development.
3.5 Predictions
Predictions are developed using ensemble techniques, whose prediction accuracy was higher than that of the other prediction
algorithms we compared.
Ensemble Techniques Algorithm
Ensemble techniques improve machine learning predictions by combining many models. The combined model gives better
predictive performance than a single model.
They are meta-algorithms that combine several machine learning techniques into one predictive model in order to decrease
variance (bagging), decrease bias (boosting), or improve predictions (stacking).
They can be divided into two groups:
1. Sequential ensemble techniques, where the base learners are generated sequentially – used to exploit the dependence
between the base learners.
Ex: AdaBoost Algorithm.
2. Parallel ensemble techniques, where the base learners are generated in parallel – used to exploit the independence
between the base learners.
Ex: Random Forest Algorithm.
Bagging: - Bagging stands for bootstrap aggregation. Several models are trained on bootstrap samples of the training data and
their outputs are combined by averaging or voting.
Boosting: - It refers to a family of algorithms that are able to convert weak learners to strong learners.
Stacking: - It is an ensemble technique that uses a meta-classifier or a meta-regressor to combine multiple classification or
regression models. The base-level models are trained on the full training set, after which the meta-model is trained on the
outputs of the base-level models as features.
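The paper does not state which specific ensemble model was used, so the following is only a minimal sketch of the bagging idea, not the authors' implementation. It uses the standard library, a one-dimensional toy dataset, and threshold "stumps" as weak learners. All names and data are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement (the 'bootstrap' in bagging)."""
    return [rng.choice(data) for _ in data]


def fit_stump(sample):
    """Weak learner: choose the threshold t that minimizes training errors
    for the rule 'predict 1 if x >= t else 0'."""
    best_threshold, best_errors = None, None
    for threshold, _ in sample:
        errors = sum((x >= threshold) != bool(y) for x, y in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bagging_fit(data, n_models=25, seed=0):
    """Train one stump per bootstrap sample; the stumps are independent."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(data, rng)) for _ in range(n_models)]


def bagging_predict(thresholds, x):
    """Aggregate the stumps by majority vote (the 'aggregation' in bagging)."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]


# Toy data: label is 1 when x >= 5. Made up for illustration only.
data = [(x, int(x >= 5)) for x in range(10)]
models = bagging_fit(data)
print([bagging_predict(models, x) for x in (2, 8)])
```

Because each stump sees a different bootstrap sample, the individual thresholds vary, and the vote smooths out that variance. A random forest follows the same parallel pattern with decision trees as base learners.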
Predictions are made for the match winner, the score, and more. For score prediction, the inputs are the batting team,
bowling team, venue, current over, current score, wickets fallen, runs scored in the last 5 overs and wickets taken in the
last 5 overs. For win prediction, the inputs are the home team, away team and venue. The user supplies the inputs in this
format, and the output is predicted accordingly.
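Before a model can use the score-prediction inputs listed above, the categorical fields have to be turned into numbers. The paper does not describe its encoding. The sketch below assumes one-hot encoding for teams and venue and converts the over from cricket notation (15.2 means 15 overs and 2 balls) to balls bowled. The team and venue lists are hypothetical subsets.

```python
TEAMS = ["CSK", "MI", "RCB", "SRH"]  # hypothetical subset of teams
VENUES = [1, 2]                      # hypothetical encoded venue ids


def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


def overs_to_balls(over):
    """Convert over notation such as 15.2 (15 overs, 2 balls) to balls bowled."""
    whole, _, balls = str(over).partition(".")
    return int(whole) * 6 + int(balls or 0)


def encode_score_input(batting_team, bowling_team, venue, over, score,
                       wickets, runs_last_5, wickets_last_5):
    """Build a flat numeric feature vector from the user inputs."""
    return (one_hot(batting_team, TEAMS)
            + one_hot(bowling_team, TEAMS)
            + one_hot(venue, VENUES)
            + [overs_to_balls(over), score, wickets, runs_last_5, wickets_last_5])


# Inputs in the format described above (values chosen for illustration).
features = encode_score_input("CSK", "RCB", 1, 15.2, 130, 2, 30, 1)
print(features)
```

Treating 15.2 as a decimal number would misstate the progress of the innings, which is why the over is converted to a ball count here.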
IV. RESULTS
The win predictions are made using ensemble techniques, with an accuracy of 81%. The columns of Table 1.1, except Predicted
Winner, are the inputs given by the user; Predicted Winner is the predicted output. We have predicted over 30 matches; the 6
matches listed in Table 1.1 are some of them.
Table 1.1
S. No | Home team | Away team | Venue | Predicted Winner
1 | CSK | MI | Home_CSK | CSK
2 | MI | RCB | Home_RCB | MI
3 | SRH | CSK | Home_SRH | CSK
4 | KXIP | DC | Home_DC | DC
5 | KXIP | CSK | Home_KXIP | CSK
6 | MI | SRH | Home_MI | MI
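The paper reports accuracy without stating how it was computed. For a win prediction, the usual definition is the fraction of matches whose predicted winner equals the actual winner, sketched below. The labels are placeholders, not results from the paper.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the actual outcomes."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(predicted)


# Placeholder labels, for illustration only.
predicted = ["A", "B", "A", "A"]
actual = ["A", "B", "B", "A"]
score = accuracy(predicted, actual)
print(f"{score:.0%}")
```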
Score prediction is also made using ensemble techniques, with an accuracy of 80%. We compared this approach with a Random
Forest Regression model, and the ensemble approach was more accurate. The columns of Table 1.2, except Predicted Score, are
the inputs given by the user; Predicted Score is the predicted output. We have predicted over 20 team scores; Table 1.2
shows some of them.
Table 1.2
Batting Team1 | Bowling Team2 | Venue | Current Over | Current Score | Wickets Fallen | Runs Scored Last 5 overs | Wickets fallen Last 5 overs | Predicted Score
CSK | RCB | 1 | 15.2 | 130 | 2 | 30 | 1 | 180
MI | SRH | 2 | 10.4 | 110 | 3 | 25 | 2 | 184
RR | KKR | 1 | 16.5 | 150 | 5 | 15 | 2 | 173
KXIP | DC | 1 | 9.5 | 85 | 5 | 12 | 3 | 155
These are some of the predictions from our project; the remaining predictions are not listed in the paper.
V. CONCLUSION
Our work will be extended by expanding the dataset so that it includes not only IPL matches but also matches from other
well-known cricketing events, such as the Big Bash League, as well as international cricket. We expect the algorithms we
used to produce precise output. The website also offers faster interaction for the user: the data analysis and predictions
take more time in Tableau and Anaconda Jupyter, whereas they take less time in the web application with its backend and
frontend.