Shivam Razdan is a passionate and driven third-year Computer Science and Engineering student at R.V. College of Engineering, Bangalore, India. His journey in technology began with curiosity and has since evolved into a sustained pursuit of knowledge and innovation. With a keen eye for detail and a knack for problem-solving, he is energized by the intricate challenges that computer science presents. Beyond technology, he is an avid enthusiast of both cricket and chess. Cricket lets him unwind and enjoy a sense of camaraderie, while chess sharpens his strategic thinking and decision-making skills. Shivam also finds solace in music.

Poster Abstract

...

Introduction:  

Cannabis use has become a prevalent concern in society, and understanding the factors that contribute to it is crucial for effective intervention and prevention strategies. The main aim of this project is to use machine learning methods to predict whether an individual is a cannabis user based on features such as personality traits and demographics. The project also aims to analyze how these features relate to the likelihood of drug usage and to uncover patterns and associations that could help identify individuals who are more likely to consume cannabis. The insights derived from this project can inform policy decisions, targeted interventions, and educational programs aimed at mitigating the risks associated with cannabis use.

The distributed architecture (HPCC Systems) that will be used in this project offers several advantages. First, it allows large amounts of data to be handled efficiently, so that the analysis can encompass a diverse range of individuals and factors. Second, the parallel processing capabilities of the distributed system will reduce the time required for model training and evaluation.

Objective:  

Identifying and selecting relevant features: The first objective is to carefully analyze and select the most relevant features, including personality traits and demographic information, that have a potential influence on cannabis usage. 

Explore relationships between criteria and drug usage: Using machine learning methods, the aim is to investigate the relationships between each criterion and the likelihood of an individual engaging in cannabis use. 

Develop and train machine learning models: The next objective is to develop and train various machine learning models using the selected features and the dataset. 

Methodology: 

Gathering the Data:  

The dataset used in this project is the "Drug consumption (quantified) Data Set" from the UCI Machine Learning Repository.

The dataset contains 1885 instances. Each of the independent drug label variables takes one of seven classes: "Never Used", "Used over a Decade Ago", "Used in Last Decade", "Used in Last Year", "Used in Last Month", "Used in Last Week", and "Used in Last Day".

Manipulating the quantified data: the values will be grouped into bins. One reason for binning is to produce a more even distribution of values across the resulting groups.
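As an illustration of the binning step, the sketch below collapses the seven usage classes into a binary "user" / "non-user" label. It assumes the classes are coded `CL0` to `CL6` in the order listed above, and the cut-off (`USER_THRESHOLD`, here "Used in Last Year" or more recent) is a hypothetical choice for illustration, not a threshold stated in this abstract.

```python
from collections import Counter

# Assumed coding of the seven usage classes, in the order listed above.
USAGE_CLASSES = {
    "CL0": "Never Used",
    "CL1": "Used over a Decade Ago",
    "CL2": "Used in Last Decade",
    "CL3": "Used in Last Year",
    "CL4": "Used in Last Month",
    "CL5": "Used in Last Week",
    "CL6": "Used in Last Day",
}

# Hypothetical cut-off: classes CL3 and above count as "user".
USER_THRESHOLD = 3


def to_binary_label(code, threshold=USER_THRESHOLD):
    """Map a usage class code to 1 (user) or 0 (non-user)."""
    if code not in USAGE_CLASSES:
        raise ValueError(f"unknown usage class: {code!r}")
    return int(int(code[2:]) >= threshold)


def bin_sizes(codes, threshold=USER_THRESHOLD):
    """Count how many instances fall into each bin, to check the balance."""
    return Counter(to_binary_label(code, threshold) for code in codes)
```

Calling `bin_sizes` with different thresholds on the cannabis column is one way to see which cut-off gives the most even split between the two bins.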

Assigning headers to the data: as the headers are initially unknown, the column names and their order will be assigned by referencing the UCI webpage, which explains each feature.

Extracting the columns (features) that will be used with the machine learning models.
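The header assignment and column extraction steps could look roughly like the sketch below. The column names and their order in `COLUMN_NAMES` are an assumption based on the attribute list on the UCI page and should be checked against it; the choice of `FEATURES` and the helper names `read_rows` and `select_columns` are hypothetical.

```python
import csv

# Assumed column order for the headerless data file (verify on the UCI page).
COLUMN_NAMES = [
    "ID", "Age", "Gender", "Education", "Country", "Ethnicity",
    "Nscore", "Escore", "Oscore", "Ascore", "Cscore", "Impulsive", "SS",
    "Alcohol", "Amphet", "Amyl", "Benzos", "Caff", "Cannabis", "Choc",
    "Coke", "Crack", "Ecstasy", "Heroin", "Ketamine", "Legalh", "LSD",
    "Meth", "Mushrooms", "Nicotine", "Semer", "VSA",
]

# Hypothetical feature selection: demographics plus personality scores.
FEATURES = ["Age", "Gender", "Education", "Nscore", "Escore",
            "Oscore", "Ascore", "Cscore", "Impulsive", "SS"]
TARGET = "Cannabis"


def read_rows(lines, column_names=COLUMN_NAMES):
    """Parse headerless CSV lines into dicts keyed by the assigned names."""
    rows = []
    for values in csv.reader(lines):
        if len(values) != len(column_names):
            raise ValueError(f"expected {len(column_names)} values, got {len(values)}")
        rows.append(dict(zip(column_names, values)))
    return rows


def select_columns(rows, features=FEATURES, target=TARGET):
    """Return the numeric feature matrix X and the raw target labels y."""
    X = [[float(row[name]) for name in features] for row in rows]
    y = [row[target] for row in rows]
    return X, y
```

`read_rows` accepts any iterable of text lines, so it can be given an open file handle for the downloaded data file.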

Data Visualization: 

Correlation Matrix 

After normalizing the data and selecting the relevant features, the pairwise correlations between the selected features will be computed. The resulting correlation matrix will be displayed as a heatmap.
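A minimal sketch of the correlation matrix computation is shown below, assuming the Pearson correlation coefficient is used (the abstract does not name a specific coefficient). The function names are illustrative, and the heatmap rendering itself is left to a plotting library.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Build a nested dict {name_a: {name_b: r}} from {name: values}."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}
```

Each cell of the heatmap would then be colored according to the corresponding entry of the returned matrix, with values ranging from -1 to 1.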

Gradient Boosting Classifier - A Gradient Boosting Classifier will be used to estimate which features are most important.

Modeling and Evaluation - Various machine learning models will be evaluated to find the one with the highest accuracy, including:

  • Linear Regression 
  • Logistic Regression 
  • Support Vector Classifier (SVC)  
  • Ridge 
  • ElasticNet 
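Several of the models listed above (Linear Regression, Ridge, ElasticNet) output continuous scores rather than class labels, so comparing them on accuracy requires converting scores to labels first. The sketch below shows one way this could be done, assuming a binary target and a 0.5 cut-off; both the cut-off and the helper names are assumptions for illustration, not details given in this abstract.

```python
def predict_labels(scores, threshold=0.5):
    """Turn continuous model outputs into 0/1 labels (assumed cut-off)."""
    return [1 if score >= threshold else 0 for score in scores]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def best_model(accuracy_by_model):
    """Return the name of the model with the highest accuracy."""
    return max(accuracy_by_model, key=accuracy_by_model.get)
```

With the held-out accuracy of each model collected in a dictionary keyed by model name, `best_model` picks the candidate to report.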

Presentation

In this video recording, Shivam provides a tour and explanation of his poster content.

...

Cannabis User Prediction based on Personality Traits and Demographics

Click on the poster for a larger image.
