User Categorization of Stack Overflow Data
Using the K-Means Clustering Algorithm
Project Presentation
Team Members: Afzal Ahmad and Abhishek Barnwal
What is Stack Overflow?
• Stack Overflow is a question and answer site, written in C#, for
professional and enthusiast programmers. It is built and run by its
users as part of the Stack Exchange network of Q&A sites.
About User Accounts on Stack Overflow
• The site is all about getting answers. Good answers are voted up and
rise to the top.
• A user's reputation score goes up when others vote up their questions,
answers, and edits.
• Badges are special achievements that users earn for participating on the
site. They come in three levels: bronze, silver, and gold.
• The person who asked a question can mark one answer as "accepted".
Dataset Overview
• The dataset is obtained from the Stack Exchange data dump at the
Internet Archive.
• The link to the dataset is as follows.
Www.archive.org/details/stackexchange
• Each site under Stack Exchange is published as a separate archive,
compressed with 7-Zip, that contains several XML files.
K-Means Algorithm Implementation In python
Dataset overview
• The Stack Overflow dataset consists of the following files, each of which is
treated as a table in our database design.
1. Posts
2. PostLinks
3. Tags
4. Users
5. Votes
6. Badges
7. Comments
• We are interested only in the Users file, which contains each user's Id and
features such as age, reputation, upvotes, and downvotes.
Features of Users Data
1. Age
2. Reputation
3. Upvotes
4. Downvotes
5. Views
Data preprocessing
• Our dataset is in XML format, which our algorithm cannot process
directly, so we need to preprocess the data first.
• Data preprocessing is a data mining technique that involves
transforming raw data into an understandable format.
• To get the data into the desired format, we need to parse it.
Python script to convert XML to CSV
from copy import deepcopy
import numpy as np
import pandas as pd
#from matplotlib import pyplot as plt
#%matplotlib inline
#plt.rcParams['figure.figsize'] = (16, 9)
#plt.style.use('ggplot')
import xml.etree.ElementTree as ET
import csv

# parse the Users.xml file from the data dump
tree = ET.parse("Users.xml")
root = tree.getroot()
count = 0  # number of user rows written
# open a file for writing; newline='' stops the csv module adding blank rows on Windows
with open('user_data1.csv', 'w', newline='') as User_data:
    # create the csv writer object
    csvwriter = csv.writer(User_data)
    csvwriter.writerow(['Reputation', 'Views', 'UpVotes', 'DownVotes', 'Age'])
    for i in root.findall('row'):
        # a missing Age is written as '0'
        data = [i.get('Reputation'), i.get('Views'), i.get('UpVotes'), i.get('DownVotes'), i.get('Age') or '0']
        count = count + 1
        csvwriter.writerow(data)
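The full Users.xml file can be large, and ET.parse loads the whole tree into memory. The following is a minimal sketch of a streaming alternative using ET.iterparse, assuming the same `row` elements and attribute names as the script above. The helper name `xml_to_csv` and the in-memory sample data are hypothetical and used only for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

FIELDS = ['Reputation', 'Views', 'UpVotes', 'DownVotes', 'Age']

def xml_to_csv(xml_source, csv_file):
    """Stream <row> elements from xml_source and write FIELDS to csv_file.

    Hypothetical helper for illustration; returns the number of rows written.
    """
    writer = csv.writer(csv_file)
    writer.writerow(FIELDS)
    count = 0
    for _event, elem in ET.iterparse(xml_source, events=('end',)):
        if elem.tag == 'row':
            # a missing Age is written as '0', as in the original script
            values = [elem.get(name) for name in FIELDS[:-1]]
            values.append(elem.get('Age') or '0')
            writer.writerow(values)
            count += 1
            elem.clear()  # release the element's data once it is written
    return count

# Tiny in-memory sample standing in for Users.xml (made-up values)
sample = io.BytesIO(
    b'<users>'
    b'<row Id="1" Reputation="101" Views="5" UpVotes="3" DownVotes="0" Age="30"/>'
    b'<row Id="2" Reputation="1" Views="0" UpVotes="0" DownVotes="0"/>'
    b'</users>'
)
out = io.StringIO(newline='')
n_rows = xml_to_csv(sample, out)
print(n_rows)
print(out.getvalue())
```

To run it on the real dump, pass the path "Users.xml" as `xml_source` and an output file opened with `newline=''` as `csv_file`.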
Converted CSV file format
What is clustering?
Clustering is the task of dividing a population or set of data points
into a number of groups such that data points in the same group
are more similar to each other than to those in other groups. In
simple words, the aim is to segregate data points with similar traits
and assign them to clusters.
Pictorial representation of Clustering
Types of Clustering
1. Hard Clustering: In hard clustering, each data point belongs
to exactly one cluster.
2. Soft Clustering: In soft clustering, instead of putting each
data point into a single cluster, a probability or
likelihood of that data point belonging to each cluster is
assigned.
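The difference between the two types can be sketched with one point and two cluster centres. The coordinates below are made up, and turning distances into membership weights with a softmax over negative distances is only one illustrative choice, not something the slides specify.

```python
import math

centroids = [(0.0, 0.0), (4.0, 0.0)]  # two made-up cluster centres
point = (1.0, 0.0)                    # made-up data point

# Euclidean distance from the point to each centre
dists = [math.dist(point, c) for c in centroids]

# Hard clustering: the point belongs to exactly one cluster (the nearest)
hard_label = dists.index(min(dists))

# Soft clustering: one membership weight per cluster, summing to 1.
# Softmax over negative distances is an assumption made for this sketch.
weights = [math.exp(-d) for d in dists]
total = sum(weights)
soft_membership = [w / total for w in weights]

print(hard_label)
print(soft_membership)
```

K-means, used in this project, produces hard assignments of the first kind.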
Algorithm Used
• We use the K-means clustering algorithm to categorise users into
different types on the basis of the given features.
• K-means clustering is a data mining/machine learning algorithm used
to cluster observations into groups of related observations without
any prior knowledge of those relationships.
• It is an unsupervised learning algorithm: it is not given any
cluster labels in advance.
• The algorithm finds K categories, where K is a value chosen
beforehand.
Unsupervised Learning
Unsupervised learning is a type of machine learning algorithm used to
draw inferences from datasets consisting of input data without labeled
responses.
The most common unsupervised learning method is cluster analysis, which
is used for exploratory data analysis to find hidden patterns or grouping in
data. The clusters are modeled using a measure of similarity which is
defined upon metrics such as Euclidean or probabilistic distance.
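As a small worked example of the Euclidean metric, the sketch below computes the distance between two users' feature vectors. The two users are made up, and the feature order (Reputation, Views, UpVotes, DownVotes, Age) is assumed from the CSV header used in this project.

```python
import math

# Assumed feature order: Reputation, Views, UpVotes, DownVotes, Age
user_a = [200.0, 10.0, 5.0, 1.0, 30.0]  # made-up user
user_b = [500.0, 14.0, 8.0, 1.0, 34.0]  # made-up user

# Euclidean distance: square root of the sum of squared feature differences
squared_diffs = [(a - b) ** 2 for a, b in zip(user_a, user_b)]
distance = math.sqrt(sum(squared_diffs))

print(squared_diffs)
print(distance)
```

In this example the reputation difference contributes far more to the distance than the other features, because its values span a much larger range.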
Working of K-Means Algorithm
1. Specify the desired number of clusters K: Let us choose K=2 for
these 5 data points in 2-D space.
2. Randomly assign each data point to a cluster: Let's assign three
points to cluster 1, shown in red, and two points to cluster 2,
shown in grey.
3. Compute cluster centroids: The centroid of the data points in the red
cluster is shown as a red cross, and that of the grey cluster as a grey
cross.
4. Re-assign each point to the closest cluster centroid.
5. Re-compute cluster centroids: Now re-compute the
centroids for both clusters.
6. Repeat steps 4 and 5 until no improvements are possible.
The algorithm terminates when no data points switch clusters between
two successive repeats, unless a different stopping condition is
specified explicitly.
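The six steps above can be sketched in NumPy as follows. This is an illustrative implementation under stated assumptions, not the project's actual code: the function name `kmeans`, the `seed` and `max_iter` parameters, the handling of empty clusters, and the five sample points are all made up for this example.

```python
import numpy as np

def kmeans(points, k, seed=0, max_iter=100):
    """Plain K-means following the six steps above (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = len(points)
    # Step 2: randomly assign each point to a cluster,
    # redrawing until every cluster has at least one point
    labels = rng.integers(0, k, size=n)
    while len(np.unique(labels)) < k:
        labels = rng.integers(0, k, size=n)
    centroids = np.zeros((k, points.shape[1]))
    for _ in range(max_iter):
        # Steps 3 and 5: each centroid is the mean of its cluster's points
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
        # Step 4: re-assign each point to the closest centroid
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Step 6: stop once no point switches cluster
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, centroids

# Step 1: K = 2 for five made-up points in 2-D space
points = np.array([[1.0, 1.0], [1.5, 2.0], [1.0, 0.5], [8.0, 8.0], [9.0, 9.5]])
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

The `max_iter` cap plays the role of an explicitly specified stopping condition. Without it, the loop ends only when the assignments stop changing.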
Pictorial representation of K-means Algorithm
Implementation of K-means Algorithm
1. We converted our XML data into CSV.
2. We ran the K-Means algorithm on the Stack Overflow data.
3. With K=4, we get four cluster centres with the values given
below.
array([[ 1.82709702e+02, 8.86936593e-01, 8.58670741e-01,
3.59052712e-02, 3.21581360e+01],
[ 1.71912000e+04, 7.34000000e+01, 1.92800000e+02,
1.29000000e+01, 3.92000000e+01],
[ 3.89650000e+04, 3.47000000e+02, 5.10000000e+02,
8.60000000e+01, 3.00000000e+01],
[ 4.18018750e+03, 1.38750000e+01, 3.42187500e+01,
1.40625000e+00, 3.27187500e+01]])
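Once the cluster centres are known, a user can be placed in the cluster whose centre is nearest. The sketch below does this for the four centres above, assuming the columns follow the CSV header order (Reputation, Views, UpVotes, DownVotes, Age), which the slides do not state explicitly. The helper `nearest_cluster` and the sample user are hypothetical.

```python
import math

# Cluster centres copied from the array above.
# Assumed column order: Reputation, Views, UpVotes, DownVotes, Age
centroids = [
    [1.82709702e+02, 8.86936593e-01, 8.58670741e-01, 3.59052712e-02, 3.21581360e+01],
    [1.71912000e+04, 7.34000000e+01, 1.92800000e+02, 1.29000000e+01, 3.92000000e+01],
    [3.89650000e+04, 3.47000000e+02, 5.10000000e+02, 8.60000000e+01, 3.00000000e+01],
    [4.18018750e+03, 1.38750000e+01, 3.42187500e+01, 1.40625000e+00, 3.27187500e+01],
]

def nearest_cluster(user):
    """Return the index of the centre closest to user (Euclidean distance)."""
    dists = [math.dist(user, c) for c in centroids]
    return dists.index(min(dists))

new_user = [5000.0, 20.0, 40.0, 2.0, 28.0]  # made-up user
cluster = nearest_cluster(new_user)
print(cluster)
```

Because reputation has by far the largest range, it largely determines which centre is nearest in this sketch.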
Pictorial form of Data with 4 cluster centre
Important information regarding insights of data
1. We processed the data of Android users of Stack Overflow.
2. All the results and insights here apply only to Android-specific users.
3. We used only the users' numerical information, as the K-Means
algorithm works on Euclidean distance.
4. The user information used here is as follows:
'Age', 'Views', 'Reputation', 'Upvotes', 'Downvotes'
Insights from Stack Overflow data
1. Almost all the Android-specific users are above 30 in age.
2. The users with the highest reputation, views, upvotes, and
downvotes are the youngest among all users. This suggests that the
young community is more involved in Android than the older one.
3. As age increases, users are less interested in downvoting
answers. The young community is the most involved in both downvoting
and upvoting answers.
4. Profile views are mostly affected by reputation: they increase 3-4
times when the reputation doubles.
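One way to check the last insight against the four cluster centres is to compute how much profile views grow per doubling of reputation between neighbouring centres. This is a rough sketch, assuming the first two columns of the centroid array are Reputation and Views. The per-doubling factor is a derived quantity introduced here for illustration, not a figure from the slides.

```python
import math

# (reputation, views) of the four cluster centres, sorted by reputation.
# Assumes columns 0 and 1 of the centroid array are Reputation and Views.
centres = [
    (1.82709702e+02, 8.86936593e-01),
    (4.18018750e+03, 1.38750000e+01),
    (1.71912000e+04, 7.34000000e+01),
    (3.89650000e+04, 3.47000000e+02),
]

factors = []
for (rep_lo, views_lo), (rep_hi, views_hi) in zip(centres, centres[1:]):
    doublings = math.log2(rep_hi / rep_lo)  # how many times reputation doubled
    views_ratio = views_hi / views_lo
    # average growth in views per doubling of reputation between the two centres
    factors.append(views_ratio ** (1 / doublings))

for factor in factors:
    print(round(factor, 2))
```

Each printed value is the factor by which views grow per reputation doubling between two neighbouring centres, so it can be compared directly with the 3-4 times figure.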
Thank You
